Prefaces to the First Edition


Preface for Teachers 

Each year in the Department of English at Newcastle 
University, I am given eleven 50-minute lecture slots in which 
to introduce English phonetics and phonology to around a 
hundred students in the first semester of their first year on a 
variety of different undergraduate degree programmes, 
including English language and literature, linguistics, English 
language, modern languages, music, history and many others.
Also included in the student body are European exchange 
undergraduates and students taking applied linguistics 
postgraduate degrees in media technology and in linguistics for 
teachers of English as a second language. 

Given the range of degree types, this is a daunting task, 
made even more difficult by the fact that a substantial minority 
of the students do not have English as their first language. In a 
typical year, the student cohort will include speakers of Arabic, 
French, Spanish, German, Greek, Japanese, Korean, Mandarin 
or Cantonese Chinese, and Thai. Many of the non-native 
speakers will have been taught RP; others will have been 
taught General American. Amongst the native speakers of 
English, very few of the students will be speakers of RP, so 
that the non-native speakers are more likely to speak RP than 
the native speakers. 

The vast majority of the student body will take their study of 
English phonetics and phonology no further, and the one factor 
which the majority of this diverse band of students shares is
that they have no previous knowledge of phonetics or
phonology; the course must therefore be ab initio. 

One faces a dilemma in teaching such a course: on the one 
hand, one wants to cater to the small minority who will go on 
to study phonology at a more advanced level. On the other 
hand, one wants to introduce the subject without 
overwhelming the students with a mass of bewildering 
descriptive detail and an avalanche of seemingly arcane 
theoretical constructs. It is a moot point whether this dilemma 
can be resolved. However, this textbook was written as an 
attempt at a solution. 

It is arguable that textbooks are harder to write than 
monographs, and that the more elementary the textbook, the 
harder it is to write: one can barely write a line without being 
aware of one’s often questionable assumptions, and one has 
always to resist the temptation to question them in the body of 
the text. One continually has the sense of one’s peers looking 
over one’s shoulder and guffawing at the absurd 
oversimplifications which one is knowingly committing to print. 
But it has to be done: students have to learn to walk before 
they can learn to run; they also have to learn to crawl before
they can learn to walk. 

Writing and using textbooks is an empirical matter: it is very 
often immediately apparent when an exercise, chapter or book 
is simply not working, for a given body of students. Almost all 
of the textbooks which I have used on the first-year Newcastle 
course described here have proved to be unsuitable for this 
type of student cohort in one way or another; mostly, they 
have contained far too much detail. I have therefore set out to 
write a very short, very simple coursebook which deliberately 
ignores a great many descriptive/theoretical complexities.
My aim has not been to introduce students to phonological
theory; rather, I have sought to introduce some of the bare 
essentials of English phonetics and phonology in a manner that 
is as theory-neutral as possible. This is fundamentally 
problematic, of course, since there is no such thing as theory- 
neutral description. I have therefore decided to adopt various 
theoretical/descriptive views, such as the tongue-arch/cardinal 
vowel approach to articulatory description, the phonemic 
approach to segmental phonology, the trochaic approach to 
English foot structure, and so on, on the purely pragmatic basis 
of what I have found to be easiest to convey to the students. 

I have ignored acoustic phonetics for the very simple reason 
that our department lacks a phonetics lab, and I have not 
included distinctive features, since the mere sight of arrays of 
features marked with ‘+’ and ‘−’ symbols seems to render
large numbers of my first-year students dizzy (particularly 
those majoring in English literature). I have also excluded 
feature geometry, the mora, under-specification and a great 
many other theoretical/descriptive notions, in an attempt to 
pare the subject down to a bare minimum of these. 

The first four chapters are deliberately very short indeed, 
and contain only the most elementary introduction to 
articulatory phonetics. My aim there is to offer the student a 
gentle introduction to the course. I have spread the introduction 
of the phonemic principle over two chapters, since, in my 
experience, students find their first encounter with these ideas 
something of a quantum leap. The chapters on word stress, 
rhythm, connected speech phenomena and accent variation 
contain a very stripped-down, minimal, account of those 
subjects; I hope that there is enough there to act as a 
foundation for those students who wish to study these matters
in more depth. In the chapter on syllable structure, I have been
a little more ambitious in introducing analytical complexity, on 
the assumption that syllable structure is something that 
beginning students seem to be able to get the hang of more 
easily than, say, rhythm or intonation. 

I believe that one of the most important duties of a university 
teacher is to induce in the student a sense of critical awareness, 
a grasp of argumentation and the role of evidence. On the 
other hand, one has to be very wary of introducing students at 
the most elementary stage to the idea of competing analyses: 
they find it difficult enough to get the hang of one sort of 
analysis, without being asked to assess the merits and demerits 
of competing analyses (even at the post-elementary stage, most 
undergraduates are very resistant to the idea of critically 
comparing different analyses). I have tried to overcome this 
dilemma by introducing competing analyses and assumptions at 
one or two points, while consciously ignoring them elsewhere. 

The exercises are meant to be discussed at weekly 
seminar/tutorial meetings; my experience is that, if 
phonetics/phonology students are not made to do exercises, 
they easily come to believe that they have grasped the subject 
when in fact they have not. It is my hope that students who 
have completed this course would find it possible to tackle 
more advanced textbook treatments of these topics, such as 
those given by Giegerich (1992) and Spencer (1996). Whether 
that hope is fulfilled is, of course, very much an empirical 
matter. 


Preface for Students

This is an elementary introduction to English phonetics and
phonology, designed for those who have no previous 
knowledge whatsoever of the subject. It begins with a very 
elementary introduction to articulatory phonetics, and then 
proceeds to introduce the student to a very simplified account 
of some of the main aspects of the phonological structure of 
present-day English. 

It is arguable that there are two main questions one might 
ask in studying the English language: what is it about English 
that makes it a language (as opposed to, say, a non-human 
communication system), and what is it about English that 
makes it English (as opposed to, say, French or Korean)? This 
book attempts to provide the beginnings of an answer to both 
of those questions, with respect to one aspect of English: its 
phonology. 

Thus, although the subject matter of this book is English, 
there is reference to the phonology of other languages at 
several points, often in contrastive exercises which are 
designed to bring out one or more differences between English 
and another language. These contrastive exercises are included 
because native speakers of English, who often have little or no 
detailed knowledge of other languages, tend to assume that the 
phonology of English is the way it is as a matter of natural fact, 
a matter of necessity. For many such speakers, it will seem 
somehow natural, for instance, that the presence of the sound 
[f] as opposed to [v] functions to signal a difference in meaning 
(as in fan vs van). To the English speaker, [f] and [v] will 
therefore seem easily distinguishable, and that too will appear 
to be a natural fact. But the fact that these sounds have that 
function in English is a conventional, not a necessary or natural 
fact: English need not have been that way, and may not always
be that way. Just as one can gain a new perspective on one’s
own culture by learning about other cultures, so one can gain a 
fresh perspective on one’s native language by learning a little 
about other languages. One can also, in learning about other 
cultures, gain some sense of what human cultures are like. 
Similarly, one can begin to get a sense of what human language 
phonologies are like by learning in what respects they resemble 
each other. Those points of resemblance concern general 
organizational properties of human language phonologies, such 
as the phonemic principle and the principles of syllable 
structure. 

Reading a textbook on linguistic analysis is not like reading a 
novel. It is vital that the student complete the exercises at the 
end of each chapter before proceeding to the next chapter: they 
are designed to get the student to apply the ideas introduced in 
the chapter. The reader will not have properly grasped the 
ideas contained in this, or any other, textbook on phonology by 
simply sitting back in an armchair and reading the text, even if 
the student is under the impression of having understood the 
ideas. Vast numbers of students who have attempted to master 
linguistic analysis without actually doing it have ended up with 
disastrous exam results: no one ever became any good at 
linguistic analysis without actually doing it. 

Like most linguistics textbooks, this book is cumulative in 
nature: what has been introduced in earlier chapters is 
presupposed in later chapters. It is fatal, therefore, to let 
several weeks go by without doing the reading and the 
exercises, in the hope of catching up later: the result is very 
likely to be that you will simply find yourself out of your 
depth, even though this is an elementary textbook. It is simply 
not possible to dip in and out of a linguistic analysis textbook,
no matter how basic, in the way that one might dip in and out
of a dictionary or an encyclopedia. 

This book is designed to cater for students who, in all 
probability, will not pursue their studies in English phonetics 
and phonology any further. However, students who will be 
proceeding to a more advanced level should be able to tackle 
more advanced textbook treatments of these topics, such as 
those given by Giegerich and by Spencer (see Suggested 
Further Reading at the end of the book). Those students 
should also find it easier to tackle one of the many 
introductions to general phonological theory which are not 
focused on English (again, see Suggested Further Reading). In 
order to prepare such students for more advanced study, I 
have introduced, at some points, an indication of some of the 
difficulties with some of the assumptions made in this 
textbook, or a brief discussion of competing analyses. Although 
this textbook merely scratches the surface of the subject 
matter, I hope that there is enough here to make the subject of 
phonology seem intriguing to the student who intends to pursue 
his or her studies. 

It is my hope that this book will be of some use to teachers 
of English as a foreign language, although it is not designed 
specifically for such readers. I am always surprised to discover 
how little in the way of knowledge of English phonetics and 
phonology such teachers often have. I have no experience of 
such teaching, and while I make no suggestions as to how the 
notions introduced in this book might be put to use in the 
TEFL classroom, I find it hard to believe that a knowledge of 
the basics of English phonetics and phonology could fail to be 
useful to the TEFL teacher in some way, even if only as 
background knowledge which extends the teacher’s knowledge
of English. I also hope that some of the contrastive exercises
might help suggest ways in which one’s native language 
phonology can interfere with one’s attempt to acquire English 
as a second language. 

Newcastle, February 1999


Preface to the Second Edition


The first edition of this book was written while I was teaching 
in an English university. Since then, I have moved to the 
English department at Montpellier University, in France. While 
I always had non-native speakers of English in my classes at 
Newcastle University, most of my students were native 
speakers of English; now, the vast majority of my students are 
not native speakers of English. Most are French, but there are 
also Spanish, Portuguese, Greek, German, Dutch, Polish, 
Russian and Bulgarian students, among others. The book has 
changed as a result: it is more orientated towards learners of 
English as a foreign language, but it is still useful for native 
speakers, I believe. 

The main changes to the text concern the later chapters: 
chapters 8, 9 and 10 have been entirely rewritten, and there is 
a new chapter (chapter 11) on the relationship between spelling 
and pronunciation, known as grapho-phonemics. Teachers 
whose students are native speakers of English may choose to 
skip this chapter, but it could prove useful for students who 
wish to go on to teach English as a foreign language. I have 
expanded the appendix (renamed as chapter 13) to cover 
additional varieties of English. There are now sound files which 
accompany exercises, the treatment of intonation, and the 
description of some of the varieties of English given here: these
are marked in the margins with a headphones symbol. 



I have insisted on retaining practice at phonetic transcription, 
for two reasons. Firstly, I believe that it reinforces the
distinction between phonetic transcription, based on listening to
speech sounds, and phonological analysis, in which phonemes 
(as conceived of here) are not speech sounds, and cannot be 
heard. Secondly, I hope that some readers of this book will go 
on to engage in the empirical study of varieties of English, 
which typically involves both listening carefully to, and 
phonetically transcribing, recordings of speakers of various 
accents, and also engaging with theoretical issues in the 
analysis of those accents. The phonetic transcription exercises 
are now based on audio recordings. 

The book is not intended as an introduction to phonological 
theory; some books of that sort are listed in the Suggested 
Further Reading. Inevitably, I have had to draw on notions 
proposed in various theoretical frameworks. Any proposed 
distinction between theory and description is fraught with 
difficulties: there can be no description without theoretical 
assumptions, as the philosopher of science Karl Popper pointed 
out. However, in my view, some kind of distinction between 
theory and description must be upheld. My aims here are 
primarily descriptive. 

Any queries and/or corrections can be sent to: 
philip.carr@univ-montp3.fr 

Montpellier, December 2011


Figure 1 The organs of speech



1 Lips
2 Teeth
3 Alveolar ridge
4 Hard palate
5 Soft palate (velum)
6 Uvula
7 Tip of the tongue
8 Blade of the tongue
9 Front of the tongue
10 Back of the tongue
11 Nasal cavity
12 Oral cavity
13 Pharynx
14 Larynx


Figure 2 The International Phonetic Alphabet (Department of
Theoretical and Applied Linguistics, School of English, 
Aristotle University of Thessaloniki, Thessaloniki 54124, 
Greece)


1


English Phonetics: Consonants

(i)

1.1 Airstream and Articulation

Speech sounds are made by modifying an airstream. The 
airstream we will be concerned with in this book involves the 
passage of air from the lungs out through the oral and nasal 
cavities (see figure 1). There are many points at which that
stream of air can be modified, and several ways in which it can 
be modified (i.e. constricted in some way). The first point at 
which the flow of air can be modified, as it passes from the 
lungs, is in the larynx (you can feel the front of this, the 
Adam’s apple, protruding slightly at the front of your throat; 
see figure 1), in which are located the vocal folds (or vocal
cords). The vocal folds may lie open, in which case the
airstream passes through them unimpeded. Viewed from
above, the vocal folds, when they lie open, look like this:

Open vocal folds

The vocal folds may be brought together so that they are
closed, and no air may flow through them from the lungs: 

Closed vocal folds 



One way in which the outgoing stream of air may be 
modified is by applying a certain level of constant muscular 
pressure sufficient to close the vocal folds along their length, 
but only just; the build-up of air pressure underneath this 
closure is sufficient, given the degree of muscular pressure, to 
force that closure open, but the air pressure then drops, and 
the muscular pressure causes the folds to close again. The 
sequence is then repeated, very rapidly, and results in what is 
called vocal fold vibration. You should be able to feel this 
vibration if you put your fingers to your larynx and produce the 
sound which is written as <z> in the word hazy (although you 
will probably also feel vibration elsewhere in your head). 
Sounds which are produced with this vocal fold vibration are 
said to be voiced sounds, whereas sounds produced without 
such vibration are said to be voiceless.

To transcribe speech sounds, phoneticians use the
International Phonetic Alphabet (the IPA; see figure 2): the IPA
symbol for the sound written <z> in hazy is [z]. You should be 
able to feel the presence of vibration in [z] if you put your 
fingers to your larynx and produce [z], then [s] (as in miss), 
then [z] again: [z] is voiced, whereas [s] is voiceless. This 
distinction will constitute the first of three descriptive 
parameters by means of which we will describe a given 
consonantal speech sound: we will say, for any given 
consonant, whether it is voiced or voiceless. 


1.2 Place of Articulation 

We will refer to the points at which the flow of air can be 
modified as places of articulation. We have just identified the 
vocal folds as a place of articulation; since the space between 
the vocal cords is referred to as the glottis, we will refer to 
sounds produced at this place of articulation as glottal sounds. 
There are many other places of articulation; we will identify a 
further seven. 

Firstly, sounds in which the airflow is modified by forming a 
constriction between the lower lip and the upper lip are 
referred to as bilabial sounds. An example is the first sound in 
pit. 

A bilabial sound: the first sound in pit

Secondly, sounds in which there is a constriction between
the lower lip and the upper teeth are referred to as labio-dental
sounds. An example is the first sound in fit.



Thirdly, sounds in which there is a constriction between the 
tip of the tongue and the upper teeth are referred to as dental 
sounds. An example is the first sound in thin. 

A dental sound: the first sound in thin

For the remaining places of articulation, let us distinguish
between the tip, the blade of the tongue, the front of the
tongue and the back of the tongue (as in figure 1). Let us also
distinguish various points along the upper part of the mouth.
We will identify four different areas: the alveolar ridge (the
hard, bony ridge behind the teeth; see figure 1), the hard
palate (the hard, bony part of the roof of the mouth; see figure
1), the palato-alveolar (or post-alveolar) region¹ (the area in
between the alveolar ridge and the hard palate), and the velum
(the soft part at the back of the roof of the mouth, also known
as the soft palate; see figure 1).

Sounds in which there is a constriction between the blade or 
tip of the tongue and the alveolar ridge are called alveolar 
sounds. An example is the first sound in sin. 

An alveolar sound: the first sound in sin

Sounds in which there is a constriction between the blade of
the tongue and the palato-alveolar (or post-alveolar) region are 
called palato-alveolar sounds. An example is the first sound in 
ship. 



Sounds in which there is a constriction between the front of 
the tongue and the hard palate are called palatal sounds. An 
example is the first sound in yes (although this may be less 
obvious to you; we will return to this sound below). 




A palatal sound: the first sound in yes 



Sounds in which there is a constriction between the back of 
the tongue and the velum are called velar sounds. An example 
is the first sound in cool. 

A velar sound: the first sound in cool


1.3 Manner of Articulation: Stops, Fricatives and Approximants

We have now identified eight places of articulation: glottal, 
bilabial, labio-dental, dental, alveolar, palato-alveolar, palatal 
and velar. For any given sound we will say whether it is voiced 
or voiceless, and what its place of articulation is. But to 
distinguish between the full range of speech sounds, we will 
require a third descriptive parameter: manner of articulation. 
To identify the manner in which a sound is articulated, we will 
identify three different degrees of constriction (complete 
closure, close approximation and open approximation), and 
thus three different categories of consonant: stops, fricatives 
and approximants. 
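
The correspondence just drawn between degree of constriction and category of consonant can be set out as a simple lookup. What follows is merely an illustrative sketch in Python; the names Stricture, MANNER and manner_of are inventions for the purpose of the example, and form no part of standard phonetic practice.

```python
from enum import Enum


class Stricture(Enum):
    """The three degrees of constriction identified in the text."""
    COMPLETE_CLOSURE = "complete closure"
    CLOSE_APPROXIMATION = "close approximation"
    OPEN_APPROXIMATION = "open approximation"


# Each degree of constriction yields one category of consonant.
# (Hypothetical names: this table is only an illustration.)
MANNER = {
    Stricture.COMPLETE_CLOSURE: "stop",
    Stricture.CLOSE_APPROXIMATION: "fricative",
    Stricture.OPEN_APPROXIMATION: "approximant",
}


def manner_of(stricture: Stricture) -> str:
    """Return the category of consonant for a degree of constriction."""
    return MANNER[stricture]


if __name__ == "__main__":
    for stricture in Stricture:
        print(f"{stricture.value}: {manner_of(stricture)}")
```

The point of the table is simply that the three manners of articulation are ordered by how narrow the constriction is, from the most radical (complete closure) to the least (open approximation).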

1.3.1 Stops 

The articulators in question may form a stricture of complete 
closure; this is what happens when one produces the first 
sound in pit. Here the lower and upper lips completely block 
the flow of air from the lungs; that closure may then be 
released, as it is in pit, and may then produce a sudden outflow 
of air. Sounds which are produced with complete closure are 
referred to as stops (or plosives). 

We may describe the first sound in pit as a voiceless bilabial 
stop (transcribed as [p]) and we will henceforth identify all 
consonants with three-term labels of this sort. The consonant
in abbey is also a bilabial stop, but differs from that in pit: it is
voiced. This consonant (transcribed as [b]) is a voiced bilabial 
stop. 

The first sound in tin is a voiceless alveolar stop; it is 
transcribed as [t]. Its voiced counterpart is the consonant in 
ado. This sound, the voiced alveolar stop, is transcribed as [d]. 

The first sound in cool is a voiceless velar stop; it is 
transcribed as [k]. Its voiced counterpart, the voiced velar stop, 
is transcribed as [g]; an example is the consonant in ago. 

We have now identified bilabial, alveolar and velar stops; 
stops may be made at many other places of articulation, but we 
will ignore those, as they are not relevant to the study of 
English. There is one further stop which we must mention, 
however, as it is very common in the speech of most speakers 
of English. This is the glottal stop (transcribed as [ʔ]). It is
made by forming a constriction of complete closure between 
the vocal folds. This is the sound made instead of [t] in many 
Scottish and Cockney pronunciations of, for example, the word 
butter. We will see that it is present in the speech of almost 
every speaker of English, no matter what the accent. There is 
no question of describing the glottal stop as voiced or voiceless, 
since it is articulated in the glottis itself. 

1.3.2 Fricatives 

Let us now distinguish between complete closure and another, 
less extreme, degree of constriction: close approximation. 
Sounds which are produced with this kind of constriction entail 
a bringing together of the two articulators to the point where 
the airflow is not quite fully blocked: enough of a gap remains
for air to escape, but the articulators are so close together that
friction is created as the air escapes. Sounds of this sort are 
referred to as fricatives. 

The first sound in fin is created by bringing the lower lip 
close to the upper teeth in a constriction of close 
approximation. This sound is a voiceless labio-dental fricative 
(transcribed as [f]). Its voiced counterpart (the voiced labio-dental
fricative, transcribed as [v]) is the consonant in Eva.

The first sound in thin is created by bringing the tip of the 
tongue into a constriction of close approximation with the 
upper teeth. This sound is a voiceless dental fricative, 
transcribed as [θ]. Its voiced counterpart, the voiced dental
fricative (transcribed as [ð]) is, for some speakers, the first
sound in the word that.²

The first sound in sin is created by bringing the tip or blade 
of the tongue into a constriction of close approximation with 
the alveolar ridge. This sound, transcribed as [s], is a voiceless 
alveolar fricative. Its voiced counterpart, the voiced alveolar 
fricative (transcribed as [z]) is the consonant in zoo. 

The first sound in ship is created by bringing the blade of the 
tongue into a constriction of close approximation with the 
palato-alveolar region. This sound, transcribed as [ʃ], is a
voiceless palato-alveolar fricative. Its voiced counterpart,
transcribed as [ʒ], is the second consonant in seizure.

Fricatives may be articulated at any point of articulation, but 
many of those sounds are irrelevant to the study of English. 
However, we will mention three. 

One is the voiceless velar fricative [x], found in the speech
of many Scots, in words such as loch. Another is the voiceless
fricative [ʍ], again found in the speech of many Scots, as in
words like whale (as opposed to wail) and which (as opposed
to witch); its place of articulation is labial-velar (explained in
1.3.3).

A third is the glottal fricative [h], as in the first sound in hit. 
This sound is produced by bringing the vocal cords into a 
constriction of close approximation, so that friction is 
produced. As the vocal cords are not vibrating, we will take it 
that this is a voiceless sound. 

1.3.3 Approximants 

The least radical degree of constriction occurs when the 
articulators come fairly close together, but not sufficiently close 
together to create friction. This kind of stricture is called open 
approximation. Consonants produced in this way are called
approximants.

The first sound in yes is an approximant. It is produced by 
bringing the front of the tongue close to the hard palate. 
Although the sides of the tongue are in a constriction of 
complete closure with the upper gums, the air escapes along a 
central groove in which the front of the tongue is not close 
enough to the hard palate to create friction. This sound, 
transcribed as [j], is a voiced palatal approximant. 
Approximants are normally voiced, so we will not discuss any 
voiceless counterparts for these sounds. 

The first sound in many English speakers’ pronunciation of 
rip, rope, rat, etc. is an approximant. It is produced by bringing 
the blade of the tongue into a constriction of open
approximation with the alveolar ridge. This approximant,
transcribed as [ɹ], is referred to as an alveolar approximant. As
with [j], the sides of the tongue form a constriction of complete
closure with the gums at the sides of the mouth, but the air
escapes along a central groove without creating friction. For
most speakers (and in varying degrees, depending on the
accent), the tongue body is somewhat retracted when [ɹ] is
uttered; it is therefore often referred to as a post-alveolar
approximant, but ‘alveolar approximant’ will suffice for our
purposes.³

We will be looking at more English approximants in chapter 
2. For the moment, let us identify one further such sound, the 
sound at the beginning of wet. In producing this sound, the lips
form a constriction of open approximation: there is no friction 
produced. But its articulation is more complicated than that of 
[j], the palatal approximant, since it also involves another 
articulation, between the back of the tongue and the velum (i.e. 
a velar articulation). We will therefore refer to it as a voiced 
labial-velar approximant; it is transcribed as [w]. 
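
The three-term labels introduced in this chapter can be gathered into a single table, from which one can read off both the description of a consonant and the parameter (or parameters) on which two consonants differ. The following is a minimal sketch in Python, not a piece of established phonetic software: the names Consonant, CONSONANTS, describe and differences are invented for the illustration, and the table covers only the consonants discussed in this chapter. The glottal stop is given no voicing value, in line with the remark in 1.3.1.

```python
from typing import List, NamedTuple, Optional


class Consonant(NamedTuple):
    voicing: Optional[str]  # None for the glottal stop (see 1.3.1)
    place: str
    manner: str


# Consonants described in this chapter, keyed by IPA symbol.
# (Hypothetical table, built for illustration only.)
CONSONANTS = {
    "p": Consonant("voiceless", "bilabial", "stop"),
    "b": Consonant("voiced", "bilabial", "stop"),
    "t": Consonant("voiceless", "alveolar", "stop"),
    "d": Consonant("voiced", "alveolar", "stop"),
    "k": Consonant("voiceless", "velar", "stop"),
    "g": Consonant("voiced", "velar", "stop"),
    "ʔ": Consonant(None, "glottal", "stop"),
    "f": Consonant("voiceless", "labio-dental", "fricative"),
    "v": Consonant("voiced", "labio-dental", "fricative"),
    "θ": Consonant("voiceless", "dental", "fricative"),
    "ð": Consonant("voiced", "dental", "fricative"),
    "s": Consonant("voiceless", "alveolar", "fricative"),
    "z": Consonant("voiced", "alveolar", "fricative"),
    "ʃ": Consonant("voiceless", "palato-alveolar", "fricative"),
    "ʒ": Consonant("voiced", "palato-alveolar", "fricative"),
    "x": Consonant("voiceless", "velar", "fricative"),
    "ʍ": Consonant("voiceless", "labial-velar", "fricative"),
    "h": Consonant("voiceless", "glottal", "fricative"),
    "j": Consonant("voiced", "palatal", "approximant"),
    "ɹ": Consonant("voiced", "alveolar", "approximant"),
    "w": Consonant("voiced", "labial-velar", "approximant"),
}


def describe(symbol: str) -> str:
    """Return the label for a symbol, omitting any unspecified term."""
    return " ".join(term for term in CONSONANTS[symbol] if term is not None)


def differences(a: str, b: str) -> List[str]:
    """List the descriptive parameters on which two consonants differ."""
    first, second = CONSONANTS[a], CONSONANTS[b]
    return [name for name in Consonant._fields
            if getattr(first, name) != getattr(second, name)]


if __name__ == "__main__":
    print(describe("k"))          # voiceless velar stop
    print(differences("p", "b"))  # ['voicing']
```

A table of this kind makes the logic of the three parameters explicit: every consonant occupies one cell defined by voicing, place and manner, and two consonants may differ on one, two or all three of these.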


Notes 

1 Many phonologists and phoneticians use the term ‘palato- 
alveolar’, but the chart of symbols used by the International 
Phonetic Association uses the term ‘post-alveolar’. It will
suffice for our purposes if the student takes the two terms to 
be interchangeable. There are no rigid physiological divisions 
between the alveolar ridge and the hard palate; the transition 
from one to the other is a continuum. And the range of 
articulations which can be made in between the two is
relatively varied, leading some phoneticians to distinguish
alveo-palatal from palato-alveolar articulations. We will 
simplify by ignoring these details. 

2 Many speakers of English do not have a voiced dental 
fricative; rather, the sound lacks friction: it is a voiced dental 
approximant. 

3 The [ɹ] kind of articulation found in some American and
West Country accents is also referred to by some as a
retroflex approximant. The term ‘retroflex’ means
that the blade and tip of the tongue are curled upwards and
backwards to some extent, so that the underside of a part of
the tongue forms the relevant articulation. Somewhat
inaccurately, we will use [ɹ] for these sounds.

Exercises 

1 Give the appropriate three-term description for each of 
the following sounds (e.g. [k]: voiceless velar stop): 

[0] [b] [fi in m w 

2 Give the appropriate phonetic symbol for each of the 
following sounds: 

(a) a voiced palato-alveolar fricative 

(b) a voiced alveolar stop 

(c) a voiced velar stop 

(d) a voiced dental fricative 

(e) a voiced labio-dental fricative 

3 What phonetic property distinguishes each of the 
following pairs of sounds (e.g. [p] and [b]: voicing; [s] and 
[ʃ]: place of articulation; [t] and [s]: manner of
articulation)?

(a) [k] and [g] 

(b) [b] and [d] 

(c) [d] and [z] 

(d) [z] and [ʒ]

(e) [ʃ] and [ʒ]

(f) [d] and [g] 



4 Listen to Track 1.1 at www.wiley.com/go/carrphonetics.
Which of the words on the recording begin with a 
fricative? The words are listed below. 

ship psychology veer round plot philosophy thin 

5 Listen to Track 1.2. Which of the words on the
recording end with a fricative? The words are listed below. 

stack whale swim epitaph half halve hash haze 

6 Listen to Track 1.3. Which of the words on the
recording begin with a stop? The words are listed below. 

philanderer plasterer parsimonious ptarmigan psyc 
ghoulish gruelling guardian thick tickle bin drea 

7 Describe the position and action of the articulators 
during the production of the following sounds (e.g. [d]: the 
blade of the tongue forms a constriction of complete 
closure with the alveolar ridge; the vocal cords are 
vibrating): 

[v] [θ] [k] [b] 


2 

English Phonetics: Consonants (ii) 

2.1 Central vs Lateral 

In discussing the alveolar approximant [ɹ], we said that the air 
escapes along a central groove (of the tongue, in this case; the 
same kind of groove can be formed by the lips). This is true 
for all of the fricatives and approximants described in chapter 
1: they are all central fricatives and approximants. However, it 
is possible to produce fricatives and approximants in which this 
is not the case. For instance, in the first sound in lift, the centre 
of the blade of the tongue forms a stricture of complete closure 
with part of the alveolar ridge, but the articulation which 
‘counts’ is that between the sides of the tongue and the 
alveolar ridge. Since the sides of the tongue form a constriction 
of open articulation with the alveolar ridge, and no friction is 
created, we refer to this sound (transcribed as [l]) as a voiced 
alveolar lateral approximant. Since English fricatives and 
approximants are typically central, we will use the term ‘lateral’ 
for laterals, and omit the term ‘central’ in describing central 
fricatives and approximants in English speech. The sounds [l] 
and [ɹ] are, clearly, quite similar: both are approximants, both 
are voiced, both are alveolar. The principal difference is that 
the former is lateral and the latter central.¹ 
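
The comparison between the lateral and the central approximant can be made explicit by treating each description as a small record of properties. The following Python sketch is purely illustrative: the class name `Consonant`, its field names and the helper `differences` are our own assumptions and are not drawn from any standard phonetic toolkit.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Consonant:
    """A consonant described by the parameters introduced so far."""
    symbol: str
    voicing: str   # "voiced" or "voiceless"
    place: str     # e.g. "alveolar"
    manner: str    # e.g. "approximant"
    airflow: str   # "central" or "lateral"


def differences(a: Consonant, b: Consonant) -> list[str]:
    """Return the names of the descriptive parameters on which a and b differ."""
    return [
        f.name
        for f in fields(Consonant)
        if f.name != "symbol" and getattr(a, f.name) != getattr(b, f.name)
    ]


# Hypothetical entries for the two sounds compared in the text.
lateral = Consonant("l", "voiced", "alveolar", "approximant", "lateral")
central = Consonant("ɹ", "voiced", "alveolar", "approximant", "central")

print(differences(lateral, central))  # ['airflow']
```

Only the central/lateral parameter is reported, which mirrors the observation that the two sounds are otherwise described identically.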


2.2 Taps and Trills 

We have said that, for a great many speakers of English, the 
sound at the beginning of words such as rat, rope, reap, etc. is 
a post-alveolar approximant: [ɹ]. The same is true of the sound 
which occurs after stops in words such as prude, true, creep, 
etc. However, some speakers utter, not an approximant, but a 
sound which is very like a voiced alveolar stop of very short 
duration. Many Scots utter this sound, rather than [ɹ], after 
stops, as in the words just cited. During the articulation of this 
sound, the blade of the tongue comes into a momentary 
constriction of complete closure with the alveolar ridge. This 
sound, transcribed as [ɾ], is referred to as a voiced alveolar tap 
(or flap). This is also the sound that many American speakers 
have instead of [t] or [d] in words such as Betty, witty, rider, 
heady, etc. 

Speakers of certain accents of English may utter neither an 
[ɾ] nor an [ɹ] in words such as rat, rope, reap and prude, true, 
creep, but a sound referred to as a voiced alveolar trill. Trills 
are produced by holding one articulator (e.g. the blade of the 
tongue) next to the other (e.g. the alveolar ridge) in a 
constriction of complete closure, but without the same 
muscular pressure as one finds in stops. The result is that air 
pressure builds up behind the closure and forces it open; the air 
pressure then reduces, and the muscular pressure again creates 
a constriction of complete closure. This sequence may be 
repeated in quick succession, producing, in the case of an 
alveolar trill, a series of taps of the tongue against the alveolar 
ridge. The alveolar trill is transcribed as [r], but is relatively 
rare. Scots are often said to produce this sound; however, most 
speakers of Scottish varieties of English typically produce, not 
an alveolar trill, but an alveolar tap. 

2.3 Secondary Articulation 

We have said that the lateral approximant [l] is alveolar. 
However, laterals may also be produced with an additional 
articulation, such as one formed between the back of the 
tongue and the velum, i.e. a velar articulation. When this 
happens, we may distinguish between the alveolar articulation 
as the primary articulation and the velar one as the 
secondary articulation. Where a secondary articulation is 
velar, this process is referred to as velarization: we say that 
the lateral is velarized. A velarized lateral approximant is 
transcribed using the velarization diacritic, thus: [ɫ]. This sound 
is often referred to as ‘dark l’.² Where a secondary articulation 
is palatal (formed between the front of the tongue and the hard 
palate), this process is referred to as palatalization; we say 
that the lateral is palatalized. A palatalized lateral is transcribed 
using the palatalization diacritic, thus: [lʲ]. The term ‘clear l’ is 
often used to refer to [lʲ], or to [l] (neither palatalized nor 
‘dark’). In subsequent chapters, we will consider the status of 
‘dark l’ and ‘clear l’ in different accents of English. 

2.4 Affricates 

We have, thus far, distinguished three classes of consonant 
according to degree of constriction: stops, fricatives and 
approximants. Consider the first sound in chip: it is like a stop 
in that there is complete closure between the blade of the 
tongue and the palato-alveolar region. However, it is like a 
fricative in that it clearly involves friction. That friction occurs 
during the release phase of the closure, which we referred to in 
1.3.1. Sounds produced with a constriction of complete closure 
followed by a release phase in which friction occurs are called 
affricates. We might say that one of the main differences 
(place of articulation apart) between the first sound in tip and 
the first sound in chip is that, during the release phase of the [t] 
in tip, there is no friction of the sort one finds during the 
release phase of the first sound in chip. We might therefore 
think of affricates as stops with a slow, fricative, release phase. 
The affricate in chip is a voiceless palato-alveolar affricate, 
transcribed as [tʃ]. Its voiced counterpart is [dʒ], the first 
sound in jury, joy, etc.³ 

These two affricates occur in the speech of most speakers of 
English. In later chapters, we will examine some other 
affricates which occur in the speech of speakers of certain 
accents of English. 

2.5 Aspiration 

The first stop in pit, we said, is a voiceless bilabial stop. So too 
is the first stop in spit. But the bilabial stop in pit differs 
phonetically from the bilabial stop in spit: if you hold the palm 
of your hand up close to your mouth when uttering pit, you 
will feel a stronger puff of air on releasing the bilabial stop than 
you will when you utter spit. That ‘stronger puff of air’ 
phenomenon is called aspiration: we say that the bilabial stop 
in pit is an aspirated voiceless stop, whereas the stop in spit is 
unaspirated. Aspirated voiceless stops are transcribed with the 
aspiration diacritic ([ʰ]), so that the bilabial stop in pit is 
transcribed as [pʰ]. Unaspirated stops are transcribed without 
that diacritic, so that the bilabial stop in spit is transcribed as 
[p]. 

2.6 Nasal Stops 

We have been making an assumption in our discussion thus far, 
concerning the position of the velum in the production of the 
speech sounds we have described. We have assumed that, in 
all of these sounds, the air from the lungs is escaping only 
through the mouth (the oral cavity). This is true if the velum is 
in the raised position, such that it prevents the flow of air out 
through the nasal cavity (see figure 1). In all of the sounds 
discussed thus far, the velum is indeed raised: we describe all 
such sounds as oral sounds. But the velum may be lowered, to 
allow escape of air through the nasal cavity (see figure 1). 
Sounds produced with the velum lowered, and with air 
escaping through the nasal cavity alone, are referred to as 
nasal stops.⁴ These may occur at most places of articulation; 
let us consider those which are relevant for the study of 
English. 

While nasal stops may be either voiced or voiceless, they are 
typically voiced in most human languages; we will therefore 
ignore voiceless nasal stops and use the term ‘nasal stop’ to 
imply ‘voiced nasal stop’. 

Bilabial nasal stops (transcribed [m]) entail, as one would 
expect, complete closure between the lips, voicing, and escape 
of the air through the nasal cavity. An example is the first 
consonant in map. 

Labio-dental nasal stops (transcribed [ɱ]) entail complete 
closure between the lower lip and the upper teeth, voicing, and 
escape of the air through the nasal cavity. An example is the 
second consonant in pamphlet. In English, they occur before 
labio-dental sounds, as in this case. The nasal stop articulation 
in cases such as these reflects a process of assimilation. 
Assimilation processes are processes in which one sound 
becomes similar to an adjacent sound. In this case, the nasal is 
assimilated to the following fricative, in the sense that it ‘takes 
on’ the place of articulation of the fricative. Such processes 
involve a principle of ease of articulation. In this case, if the 
nasal in pamphlet is articulated at the same place as the 
following fricative, this saves the speaker the articulatory effort 
of moving from a bilabial to a labio-dental articulation. We will 
return to such processes in chapter 6. 

Dental nasal stops (transcribed as [n̪]) entail complete 
closure between the tip of the tongue and the upper teeth, 
voicing, and escape of the air through the nasal cavity. An 
example is the second consonant in tenth. As in this case, they 
occur before other dental sounds, and this too is a matter of 
assimilation involving place of articulation. 

Alveolar nasal stops (transcribed as [n]) entail complete 
closure between the blade of the tongue and the alveolar ridge, 
voicing, and escape of the air through the nasal cavity. An 
example is the first sound in not. 

Velar nasal stops (transcribed as [ŋ]) entail complete closure 
between the back of the tongue and the velum, voicing, and 
escape of the air through the nasal cavity. An example is the 
last sound in sing or the nasal stop as it is often articulated 
(especially in faster or more casual speech styles) in the word 
incredible. Once again, the latter case involves assimilation. 
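
The assimilation of a nasal stop to the place of articulation of a following consonant can be sketched as a simple rewriting step over a sequence of segments. The Python below is a minimal illustration under stated assumptions: the tables `PLACE` and `NASAL_AT` and the function `assimilate_nasals` are hypothetical names of our own, the consonant inventory is deliberately partial, vowels are written as 'V', and the rule is applied categorically, whereas in real speech such assimilation often depends on speech rate and style.

```python
# Place of articulation of some English oral consonants (a partial, illustrative list).
PLACE = {
    "p": "bilabial", "b": "bilabial",
    "f": "labio-dental", "v": "labio-dental",
    "θ": "dental", "ð": "dental",
    "t": "alveolar", "d": "alveolar", "s": "alveolar", "z": "alveolar",
    "k": "velar", "g": "velar",
}

# The nasal stop transcribed at each place of articulation.
NASAL_AT = {
    "bilabial": "m",
    "labio-dental": "ɱ",
    "dental": "n\u032a",  # n with the dental diacritic
    "alveolar": "n",
    "velar": "ŋ",
}

NASALS = set(NASAL_AT.values())


def assimilate_nasals(segments: list[str]) -> list[str]:
    """Give each nasal the place of articulation of the consonant that follows it."""
    result = list(segments)
    for i in range(len(result) - 1):
        following = result[i + 1]
        if result[i] in NASALS and following in PLACE:
            result[i] = NASAL_AT[PLACE[following]]
    return result


print(assimilate_nasals(["p", "V", "m", "f", "l", "V", "t"]))  # pamphlet
print(assimilate_nasals(["t", "V", "n", "θ"]))                 # tenth
print(assimilate_nasals(["n", "V", "t"]))                      # not: no change
```

A nasal followed by a vowel, as in not, is left untouched, since the rule refers only to a following consonant.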


Notes 

1 The central approximant [ɹ] also differs from [l] in having 
tongue body retraction and lip rounding. We will see shortly 
that alveolar laterals may be produced with retraction too. 

2 The term ‘dark l’ can also be used to refer to lateral 
approximants in which the body/back of the tongue is 
retracted and/or lowered. Accents of English vary with 
respect to the exact articulatory nature of their ‘dark l’s: some 
are velarized, while others have no velar articulation, but 
have, instead, retraction and/or lowering of the back/body of 
the tongue. Such retraction can lead to loss of alveolar 
contact, and thus to [l]-vocalization, in which the articulation 
becomes vowel-like. 

3 Some authors transcribe [tʃ] as [č] and [dʒ] as [ǰ]. We 
should, if we were to stick strictly to the conventions of the 
International Phonetic Association, transcribe both affricates 
with a ‘tie bar’ above the two symbols; we depart here from 
the conventions of the IPA chart, which does not contain an 
‘affricate’ category. 

4 The term ‘nasalized’, as opposed to ‘nasal’, is used to 
describe sounds in which air escapes through both cavities, 
the oral and the nasal. The term ‘nasal’ is used to describe 
sounds in which the air escapes through the nasal cavity 
alone. 

Exercises 


1 Listen to Track 2.1 at www.wiley.com/go/carrphonetics. 
For each of the words on the recording, identify (a) any 
oral stops, (b) any fricatives, (c) any approximants, (d) 
any affricates and (e) any nasals. For each sound that you 
identify, say whether it is voiced or voiceless and what its 
place of articulation is (e.g. the word stop: voiceless 
alveolar stop [t] and voiceless bilabial stop [p]; voiceless 
alveolar fricative [s]; no approximants, affricates or 
nasals). The words are: 

bring licking fever thinking assure measure heat 

2 Listen to Track 2.2. Which of the words on the 
recording begins with an affricate, and which (if any) with 
a stop? 

tune chip dune June 

Many speakers of English typically utter words like tune 
and dune with an affricate at the beginning of the word. 
This means that dune and June are typically 
indistinguishable. None the less, when asked in a 
phonetics class whether they utter words such as dune 
with an affricate, such speakers often deny that they do. 
These speakers typically have a more careful 
pronunciation of words such as tune and dune, in which 
there is a [j]. 

Notice, however, that there is no such more careful 
pronunciation of words like chip and June: one never 
hears these pronounced with [tj] and [dj]. In order to 
explain the difference between dune and June, we need to 
say that the speaker in some sense intends to utter [dj] in 
dune, but that ease of articulation results in a palato- 
alveolar affricated release of the stop closure, rather than a 
transition from an alveolar closure to a stricture of open 
approximation between the front of the tongue and the 
hard palate. In the case of June and chip, the intended 
articulation is a palato-alveolar affricate. 

If you are a speaker of General American, you may well 
never utter a [j] in words like tune and dune, in which 
case you will utter a stop followed by a vowel. However, 
you may well also have been told at school that the 
‘correct’ pronunciation of such words has a [j] after the 
stop. Your speech may well vary with respect to the 
presence or absence of the [j]. If your speech does vary in 
this way, how do you pronounce noon? 





3 Listen to Track 2.3. Give a phonetic transcription of 
each of the words on the recording, using a ‘V’ for the 
vowels. The words are: 

lull pear reap throws think misjudges churches 

You may well have noticed that the nasal stop in think is 
velar, rather than alveolar. It requires considerable 
conscious effort to utter that nasal stop as alveolar, and 
when one does so, the resulting pronunciation sounds 
quite unnatural. This appears to be the result of a process 
of anticipatory assimilation: the tongue adopts the 
articulatory position for the velar stop [k] during the 
pronunciation of the nasal. 

But what about the nasal-plus-velar stop sequence in 
incorrect? Many speakers of English find it easier to utter 
an alveolar, rather than a velar, nasal there, despite the 
fact that cases like incorrect also contain a sequence of a 
nasal stop followed by a velar stop. Do you have any 
hunches as to why the two cases should be different? 

English Phonetics: Vowels (i) 

3.1 The Primary Cardinal Vowels 

Let us begin by assuming that all vowels are voiced and are 
articulated with a constriction of open approximation. We will 
also assume, for the moment, that all vowels are oral sounds 
(i.e. that the velum is raised during their production). The 
range of positions which the tongue can occupy within the oral 
cavity while remaining in a constriction of open approximation 
is quite large. Let us call the entire available space for such 
articulations the vowel space. We will require a means of 
plotting the point at which a given vowel is articulated in the 
vowel space. In order to do this, we will appeal to an idealized 
chart of that space, as follows (this chart is repeated in the IPA 
chart in figure 2): 

(1) The vowel space and the primary cardinal vowels 

[Diagram not reproduced: the vowel space, with the front/back dimension on the horizontal axis] 



In this diagram, we represent the vowel space along two 
dimensions. The first is the high/low dimension (also referred 
to as the close/open dimension), depicting the height of the 
body of the tongue during the articulation of a vowel (i.e. 
depicting vowel height). This is represented as the vertical axis 
in the diagram. The second is the front/back dimension, 
depicting the extent to which the body of the tongue lies 
towards the front of the vowel space. This is represented as 
the horizontal axis in the diagram. We may identify three 
arbitrary points along this dimension: front, central and back. 
In using these two dimensions, we can say, for any given 
vowel, how high in the vowel space it is articulated, and 
whether it is a front, central or back vowel. To these two 
descriptive parameters, we will add a third, which refers to lip 
position: we will say, for a given vowel, whether, during its 
articulation, the lips are rounded or not. We will refer to the 
former sort of vowel as a rounded vowel and the latter as an 
unrounded vowel. 

It is convenient to identify several points along the perimeter 
of the vowel space. Once we have done this, we can plot the 
location of any given vowel in relation to those points. Vowels 
articulated at those points are called the cardinal vowels. We 
will now identify eight of them. 

Let us begin with the vowel which is produced when the lips 
are unrounded and the tongue is located as high as possible and 
as front as possible, without causing friction, in the vowel 
space. This is cardinal vowel no. 1, depicted at the top left- 
hand corner of the diagram in (1) above. That vowel is 
transcribed as [i]. Using our three descriptive parameters, we 
will refer to this as a high front unrounded vowel. We will not 
seek to exemplify cardinal vowels with words from English, or 
any other language, since, typically, speakers do not utter 
vowel sounds which are quite as peripheral in the vowel space 
as the cardinal vowels. Rather, we will plot the place of 
articulation of English vowels in relation to the cardinal 
vowels, using the vowel space diagram as a map of the vowel 
space. The vowel in many English speakers’ pronunciation of 
the word peep, for instance, is quite close to cardinal vowel no. 
1: it too is a high front unrounded vowel, but it is not quite as 
peripheral as cardinal vowel no. 1: it is typically slightly less 
high and slightly less front in its articulation. 

Let us now identify the cardinal vowel which lies at the 
‘opposite end’ of the vowel space: the vowel which is 
produced when the lips are unrounded and the body of the 
tongue is as low as possible and as far back as possible, 
without causing friction. This is cardinal vowel no. 5. Its 
location is depicted at the bottom right-hand corner of the 
diagram in (1) above. Transcribed as [ɑ], it is a low back 
unrounded vowel. 

We have now identified two ‘anchor’ points in the vowel 
space; we may now proceed to identify further cardinal vowels 
in relation to these. If the lips remain unrounded and the body 
of the tongue remains as low as possible in the vowel space (as 
for cardinal vowel no. 5), but the tongue is moved as far to the 
front of that space as is possible without causing friction, then 
cardinal vowel no. 4 is produced. It is transcribed as [a]. 

We have now identified two vowel heights: high and low. 
You should be able to feel this difference in tongue height if 
you utter cardinal vowel no. 1 followed by cardinal vowel no. 
4: the jaw opens considerably and the body of the tongue 
lowers considerably as one moves from the former to the 
latter. There is a continuum of vowel heights between these 
two heights; we will identify two arbitrary points along this 
continuum: high-mid and low-mid. If the lips remain 
unrounded and the body of the tongue remains as far front as 
is possible, but the tongue height is lowered somewhat from 
the cardinal vowel no. 1 position, one arrives at the front, 
high-mid unrounded vowel known as cardinal vowel no. 2. 
This is transcribed as [e]. 

In retaining the same lip position and the same degree of 
frontness, one may lower the body of the tongue further still to 
the low-mid position, and arrive at the front low-mid 
unrounded vowel known as cardinal vowel no. 3. This is 
transcribed as [ɛ]. 

If you articulate cardinal vowel no. 1, then cardinal vowels 
nos. 2, 3 and 4, you should feel the body of the tongue 
lowering progressively. These are all front unrounded vowels: 
the difference between them lies in the height of the tongue. 

Let us now consider the back cardinal vowels. If the body of 
the tongue is as high as possible and as far back as possible 
without causing friction, and the lips are, this time, rounded, 
then cardinal vowel no. 8 is produced. This high back 
rounded vowel is transcribed as [u]. 

If the lips remain rounded and the tongue remains as far 
back as possible, but the tongue height is lowered to the high-mid 
position, cardinal vowel no. 7 is produced. This high-mid back 
rounded vowel is transcribed as [o]. 

In retaining the same degree of backness and the same lip 
position, one may lower the height of the tongue still further, to 
the low-mid position, and thus produce the low-mid back 
rounded vowel known as cardinal vowel no. 6. This is 
transcribed as [ɔ]. 

You should be able to feel the tongue lowering progressively 
as you make the transition from cardinal vowel no. 8, through 
cardinal vowel no. 7, to cardinal vowel no. 6; the tongue goes 
through the same lowering process as it does for the transition 
from cardinal vowel no. 1, through no. 2, to no. 3. 

We have now identified the eight primary cardinal vowels. 
With these reference points established, we may describe the 
articulation of specific English vowels in relation to them. Let 
us begin by looking at those referred to as the English short 
vowels. 
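
The eight primary cardinal vowels amount to a small table indexed by the three descriptive parameters, and it may help to see them laid out as one. The following Python sketch is an illustration only: the names `CardinalVowel`, `PRIMARY_CARDINAL_VOWELS` and `describe` are our own assumptions, and the symbols are the standard IPA ones for cardinal vowels 1 to 8.

```python
from typing import NamedTuple


class CardinalVowel(NamedTuple):
    number: int
    symbol: str
    height: str    # "high", "high-mid", "low-mid" or "low"
    backness: str  # "front" or "back"
    rounded: bool


PRIMARY_CARDINAL_VOWELS = [
    CardinalVowel(1, "i", "high", "front", False),
    CardinalVowel(2, "e", "high-mid", "front", False),
    CardinalVowel(3, "ɛ", "low-mid", "front", False),
    CardinalVowel(4, "a", "low", "front", False),
    CardinalVowel(5, "ɑ", "low", "back", False),
    CardinalVowel(6, "ɔ", "low-mid", "back", True),
    CardinalVowel(7, "o", "high-mid", "back", True),
    CardinalVowel(8, "u", "high", "back", True),
]


def describe(vowel: CardinalVowel) -> str:
    """Build a three-term label such as 'high front unrounded vowel'."""
    lips = "rounded" if vowel.rounded else "unrounded"
    return f"{vowel.height} {vowel.backness} {lips} vowel"


for v in PRIMARY_CARDINAL_VOWELS:
    print(f"no. {v.number} [{v.symbol}]: {describe(v)}")
```

Laid out in this way, the table makes one pattern easy to see: among the primary cardinal vowels, lip rounding is found only in the back vowels nos. 6, 7 and 8.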


3.2 RP and GA Short Vowels 

There is considerable variation in the vowel sounds uttered by 
speakers of different accents of English, and we will be 
considering that variation in later chapters. For the moment, we 
will begin with two particular accents; we will later describe 
others. We will, somewhat arbitrarily, begin with the accents 
known as Received Pronunciation (RP) and General 
American (GA). RP is the accent often referred to as the 
‘prestige’ accent in British society and associated with the 
speech of the graduates of the English public schools. It is thus 
defined largely in terms of the social class of its speakers. We 
do not select it as one of our starting points for that reason; 
rather, we select it as it tends to be the accent which foreign 
learners of British English are taught, and has thus been widely 
described. GA tends to be defined in terms of the geographical 
location, rather than the social class, of its speakers. The term 
‘GA’ is an idealization over a group of accents whose speakers 
inhabit a vast proportion of the United States: it excludes 
Eastern accents such as the New York City accent, and 
Southern accents (such as that spoken in Texas). 

It has often been pointed out that terms such as ‘RP’ and 
‘GA’ entail a great deal of idealization, in that they are used to 
cover a variety of somewhat different, if converging, accents. 
We accept this as inevitable: it will be true of any term we use 
to describe an accent (e.g. ‘New York City’, ‘Cockney’, 
‘Scouse’, ‘Geordie’, ‘South African’, etc.) and indeed it is true 
of the term ‘accent’ itself. But we need some way of 
expressing valid generalizations about the speech sounds which 
members of different speech communities utter. For instance, it 
is generally true that, while RP speakers pronounce put and 
putt differently, many speakers with accents found in the North 
of England do not. To refuse to speak of different accents 
would be to throw the baby out with the bathwater, and to 
deny ourselves the opportunity of expressing statements which 
are informative, if subject to certain caveats. 

We have said nothing, as yet, about the length of vowels. For 
speakers of RP and GA, the vowels in peep and pip differ in 
several respects, one of which is vowel length. If you are an 
RP or a GA speaker, and you utter the two words, you will 
probably agree that the vowel in the former is longer than that 
in the latter. We will, accordingly, refer to the former as a long 
vowel and the latter as a short vowel. Vowel length is a relative 
matter: when we say that the vowel in pip is a short vowel, we 
are not referring to its duration in milliseconds; rather, we are 
saying that it is short in relation to other vowels, such as that in 
peep. The vowel in pip is typically articulated with the body of 
the tongue fairly front and fairly high, and with the lips 
unrounded. We will transcribe that vowel as [ɪ]. While it is a 
high front unrounded vowel, it is less high and less front than 
the vowel in peep. Its location is depicted in (2) below. 

Now consider the vowel in RP and GA speakers’ 
pronunciation of the word put. This is, for many 
speakers, a high back rounded vowel, articulated in the region 
near to cardinal vowel no. 8. It is similar to the vowel in 
school, but less high and less back. It is also shorter than that 
in school. We will transcribe this short vowel as [ʊ]; its location 
is depicted in (2) below. 

For RP and GA speakers, there is a distinction between the 
vowel in put and that in putt. Both are short vowels, but they 
differ in several respects. Firstly, the latter vowel is unrounded. 
Secondly, the vowel in putt is articulated with a fairly low 
tongue height: typically, it is just below the low-mid position. 
Thirdly, the vowel in putt is located at around the half-way 
point on the front/back axis. We will refer to vowels located in 
this region as central vowels. We will transcribe this vowel as 
[ʌ]; its location is depicted in (2) below. 

In both RP and GA, the vowels in aunt and ant differ. Both 
vowels are unrounded, but the vowel in ant is shorter than that 
in aunt, and the vowel in ant is a low front vowel, whereas that 
in aunt is a low back vowel. The low front unrounded vowel in 
ant is articulated higher and less front than cardinal vowel no. 
4. We will transcribe this as [æ]; its location is depicted in (2) 
below (although the GA vowel is higher than the RP vowel, 
and sounds rather [ɛ]-like to British speakers). 

The short vowel in RP and GA speakers’ pronunciation of 
the word bet is a front unrounded vowel, whose height is 
somewhere between cardinal vowels nos. 2 and 3. For most 
RP and GA speakers, it is closer to cardinal vowel no. 3 than to 
cardinal vowel no. 2 in height; it is also somewhat more 
centralized than cardinal vowel no. 3. For convenience’ sake, 
we will transcribe it as [ɛ]; its location is depicted in (2) below. 

The short vowel in the RP speaker’s pronunciation of the 
word pot is a back rounded vowel which is articulated with a 
tongue height somewhere between low and low-mid (i.e. 
between cardinal vowels nos. 5 and 6). It is transcribed as [ɒ]; 
its location is given in (2) below. This vowel is absent from the 
GA system: GA speakers have the vowel [ɑ] in words such as 
pot. [ɑ] is a short back unrounded low vowel. 

(2) RP and GA short vowels 



We have used the words pit, pet, pat, pot, putt and put to 
illustrate these vowels, since these words differ in 
pronunciation only with respect to the vowel. In discussing 
vowels, we will also adopt the lexical sets adopted by Wells 
(1982; see Suggested Further Reading). These are key words 
selected by Wells to bring out the similarities and differences 
between RP and GA. We will therefore, at times, refer to the 
vowel in words such as pit as the KIT vowel. The vowel in 
words such as pet we will call the DRESS vowel; words such 
as pat have the TRAP vowel; words such as pot have the LOT 
vowel; words such as put have the FOOT vowel, and words 
such as putt have the STRUT vowel. 
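
The correspondence between Wells's keywords, our example words and the symbols adopted for each accent can be summarized as a lookup table. The Python sketch below is illustrative only: the names `SHORT_VOWEL_SETS` and `sets_that_differ` are our own, and the symbols are those we take this chapter to adopt for RP and GA (in particular, [ɛ] for DRESS is an assumption about the intended transcription).

```python
# Wells's keywords for the six short vowels, with the example word used in the
# text and the symbol assumed here for each accent.
SHORT_VOWEL_SETS = {
    "KIT":   {"example": "pit",  "RP": "ɪ", "GA": "ɪ"},
    "DRESS": {"example": "pet",  "RP": "ɛ", "GA": "ɛ"},
    "TRAP":  {"example": "pat",  "RP": "æ", "GA": "æ"},
    "LOT":   {"example": "pot",  "RP": "ɒ", "GA": "ɑ"},
    "STRUT": {"example": "putt", "RP": "ʌ", "GA": "ʌ"},
    "FOOT":  {"example": "put",  "RP": "ʊ", "GA": "ʊ"},
}


def sets_that_differ(table: dict) -> list[str]:
    """Return the keywords whose RP and GA symbols are not the same."""
    return [keyword for keyword, row in table.items() if row["RP"] != row["GA"]]


print(sets_that_differ(SHORT_VOWEL_SETS))  # ['LOT']
```

On these assumptions, LOT is the only short-vowel set in which the two accents are transcribed differently, which reflects the absence of [ɒ] from the GA system.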

There is one further vowel sound, indicated above, which 
we must consider at this stage. It is the first vowel sound which 
occurs in most speakers’ pronunciation of the word about. 
This vowel is referred to as schwa; it is produced without lip 
rounding, and with the body of the tongue lying in the most 
central part of the vowel space, between high-mid and low-mid, 
and between back and front. Schwa is transcribed as [ə]. 
This vowel is typically even shorter than the short vowels we 
have just described, and it differs from those in that it may 
never occur in a stressed syllable (in about, it occurs in the 
unstressed first syllable; in elephant, it occurs in the unstressed 
second syllable; in Belinda, it occurs in the unstressed initial 
and final syllables). This vowel occurs in the speech of almost 
every speaker of English; in later chapters, we will consider its 
relation to English stressed vowels in more detail. 

Exercises 

1 Describe the position of the body of the tongue and the 
lips in the production of the following vowels: 

[i] (cardinal vowel no. 1) 

[u] (cardinal vowel no. 8) 

[ɑ] (cardinal vowel no. 5) 

2 Give an appropriate vowel symbol for the vowel in each 
of the following words, as you would utter them. Say (a) 
whether the vowel is rounded or not, (b) how back or 
front it is, and (c) how low or high it is (do this in relation 
to the cardinal vowels): 

pit apt stock bet put putt 

Note. If you are discussing these exercises in a tutorial 
group, you may well already have begun to notice 
differences in the speech of the members of the group, 
depending on the accents they speak. Clearly, there is little 
point, if one has, say, a West Yorkshire or a New York 
City accent, in transcribing these words as if one were an 
RP or a GA speaker. What you should do is to try to work 
out (preferably with the help of a tutor) what the quality 
of each vowel is, and to adopt an appropriate phonetic 
symbol for that vowel, which you can then use 
consistently in your transcriptions. In due course, we will 
be examining accent variation in more detail. 



3 Listen to Track 3.1 at www.wiley.com/go/carrphonetics. Give 
a phonetic transcription, with as much phonetic detail as 
possible, for each of the words you hear: 

elephant throb suspicious unbalanced encourage 

English Phonetics: Vowels (ii) 

4.1 RP and GA Long Vowels 

We noted that the RP/GA vowel in put ([ʊ]) is shorter than that 
in school; we also said that it is less back and less high than 
that in school. We will transcribe the vowel in school as [u:], 
where the ‘:’ diacritic denotes vowel length. This is a high 
back rounded vowel, articulated closer to cardinal vowel 8 than 
[ʊ]. 

The RP/GA short vowel [ɪ], as in fit, which we described in 
chapter 3, is a fairly high, fairly front, unrounded vowel. It 
differs from the RP/GA vowel in feet, which is longer, more 
front and higher. We will transcribe this as [i:]; it is a high front 
unrounded vowel which is closer to cardinal vowel 1 than [ɪ]. 

It is worth noting that in RP and GA, when words such as to 
and (s)he are uttered in isolation, they contain, respectively, the 
vowels [u:] and [i:], so that to is pronounced in the same way 
as two and too. But ‘function’ words like to and (s)he (which 
are not nouns, adjectives or verbs) are often uttered without 
stress, in which case they may be uttered with a schwa ([ə]), 
or in a shortened form, as in to eat (pronounced either as [təi:t] 
or as [tui:t]) and she wore (pronounced either [ʃəwɔ:] or 
[ʃiwɔ:]). The shortened form of [i:] is also found in various 
suffixes, as in the suffix in witty: [wɪti] and in the suffix in 
quickly: [kʰwɪkli]. It occurs too in the unstressed syllable of 
words such as pretty: [pʰɹɪti].¹ 

The RP vowel in port and caught is longer than that in pot 
and cot; it is a low-mid back rounded vowel, articulated closer 
to cardinal vowel 6 than is the [ɒ] in RP pot and cot. We will 
transcribe it as [ɔ:]. This is also the vowel which GA speakers 
utter in words like caught (although the GA vowel is somewhat 
shorter than the RP vowel). Thus, although both GA and RP 
speakers distinguish between pairs such as cot and caught, GA 
has [ɑ] in cot whereas RP speakers have [ɒ]. In GA, words 
such as horse and port, with an [ɹ] after the vowel, are 
typically uttered as [hoɹs] and [pʰoɹt] (see below on /oʊ/ in RP 
and GA). 

The RP and GA short vowel [æ], as in ant, is, as we have 
seen, a fairly low, rather front, unrounded vowel. It differs 
from that in aunt, which is a low back unrounded vowel, 
articulated in the region of cardinal vowel 5. The RP/GA vowel 
in aunt is also longer than the RP/GA vowel in ant. We will 
transcribe it as [ɑ:]. Thus, whereas RP has a three-way 
distinction between [ɒ], [ɑ:] and [ɔ:], GA has only a two-way 
distinction between [ɑ] and [ɔ:]. We will return to this 
difference between the accents below. 

RP and GA speakers utter a long vowel in words like bird, 
heard, dearth, although GA speakers utter an [ɹ] in words such 
as these, while RP speakers do not. The articulation for this 
vowel is pretty much the same as that for schwa: it is central 
on both the high/low and front/back dimensions, and is 
unrounded. Unlike schwa, it appears in stressed syllables. We 
will transcribe it as [ɜ:]. 

We may depict the approximate areas of articulation of these 
vowels in the vowel space as follows: 

(1) RP and GA long vowels 



We will, following Wells (1982; see Suggested Further 
Reading), refer to [i:] as the FLEECE vowel, [u:] as the 
GOOSE vowel and [ɜ:] as the NURSE vowel. Wells uses three 
key words for the [ɔ:] vowel: THOUGHT, FORCE and 
NORTH; we will see why at a later stage. Similarly, Wells uses 
three key words for the [ɑ:] vowel: START, BATH and PALM. 
One of the reasons for this is that in words of the set BATH, 
GA has [æ], whereas RP has [ɑ:]; in words of the sets 
START and PALM, both GA and RP have [ɑ:]. 


4.2 RP and GA Diphthongs 

In all of the RP and GA vowel sounds we have considered thus 
far, the articulators remain more or less in the same position 
throughout the articulation of the vowel. This means that the 
vowel quality (the acoustic effect created during the articulation 
of the vowel) remains more or less constant. That kind of 
vowel is a monophthong. However, there are vowel sounds in 
which this is not the case. This kind of vowel sound, called a 
diphthong, entails some kind of change of position of the 
articulators during its production, and thus a change in the 
vowel quality produced. A diphthong is a vowel whose quality 
changes within a syllable. A diphthong is not simply a 
sequence of two vowels. For instance, in both the RP and the 
GA pronunciations of the word seeing ([si:ɪŋ]), the vowel [i:] is 
followed by the vowel [ɪ], but the resulting sequence is not a 
diphthong, because the [i:] and the [ɪ] are not in the same 
syllable: seeing has two syllables, the first of which ends in [i:] 
and the second of which begins with [ɪ]. 

Let us begin with diphthongs which end in an [ɪ]-like quality. 
In the RP and GA pronunciations of words such as sigh, rye, 
bide, etc., the vowel begins with an [a]-like quality (in the 
region of cardinal vowel 4) and ends in an [ɪ]-like quality. We 
will transcribe this as [aɪ]. 

In the RP and GA pronunciations of say, ray, bayed, etc., 
the vowel begins with an [e]-like quality (in the region of 
cardinal vowel 2) and ends in an [ɪ]-like quality. We will 
transcribe this as [eɪ]. In words such as hair, the GA 
pronunciation is a monophthongal [e], followed by an [ɹ]. 

In the RP and GA pronunciations of soy, Roy, buoyed, etc., 
the vowel begins with an [ɔ]-like quality (in the region of 
cardinal vowel 6) and ends in an [ɪ]-like quality. We will 
transcribe this as [ɔɪ]. 

We may represent these diphthongs in the vowel space diagram 
as follows: 

(2) RP and GA diphthongs ending in [ɪ] 



Wells uses the key word FACE for the [eɪ] diphthong, 
CHOICE for the [ɔɪ] vowel and PRICE for the [aɪ] vowel; we 
will follow this practice when it proves useful. 

There are two diphthongs in RP and GA which end in an 
[ʊ]-like quality. The first of these begins with a low, rather 
back, unrounded quality. It is found in the RP and GA 
pronunciations of words such as how, now, loud. We will 
transcribe this diphthong as [aʊ]. 

The second of these diphthongs begins, among GA speakers, 
and among more conservative RP speakers, with an [o]-like 
quality. It occurs in words such as sew, roe, toad. We will 
transcribe this as [oʊ]. Among more modern RP speakers, 
words such as these are pronounced with an [əʊ]-like quality.² 
Words such as sport, uttered with the long vowel [ɔ:] but no [ɹ] 
in RP, are uttered with an [o] followed by an [ɹ] in GA. 

These two diphthongs may be represented within the vowel 
space as follows: 

(3) RP and GA diphthongs ending in [ʊ] 



For these diphthongs, Wells uses the key word MOUTH for 
the [aʊ] vowel and GOAT for the [oʊ] diphthong. 

Many RP speakers utter a series of diphthongs which end in 
an [ə]-like quality, i.e. schwa. Since schwa is pronounced in 
the centre of the vowel space, these are often called centring 
diphthongs. The first of these diphthongs begins with an [ɪ]-like 
quality. It occurs in words such as here and pier. We will 
transcribe this as [ɪə]. 

Another diphthong of this sort begins with an [ɛ]-like quality 
(in the region of cardinal vowel 3). This occurs in the RP 
pronunciation of words such as hair and pear. We will 
transcribe this as [ɛə]. Some RP speakers pronounce words of 
this sort with [ɛ:], a long vowel which is not a diphthong at all, 
but is more like a long version of a vowel in the region of 
cardinal vowel 3. 

A third such diphthong begins with an [ʊ]-like quality, and 
occurs in words such as tour and pure. We will transcribe this 
as [ʊə]. Some RP speakers pronounce some of these words 
(e.g. moor) as a long monophthong in the region of cardinal 
vowel 6. If you encounter this, you may reasonably transcribe 
it as [ɔ:]. 

These three diphthongs may be represented as follows: 

(4) RP diphthongs ending in [ə] (centring diphthongs) 



These are called centring diphthongs since schwa is located 
at the centre of the vowel space. Wells uses the key words 
NEAR for the [ɪə] vowel, CURE for the [ʊə] vowel, and 
SQUARE for the [ɛə] vowel. These diphthongs are all absent 
in GA. Their presence in RP results from the loss of [ɹ] after 
vowels in the historical development of RP: the schwa is, as it 
were, the only remaining trace of the [ɹ] which once existed in 
the accents from which RP evolved, in the pronunciation of 
words such as here, hair and pure, which are pronounced 
[hi:ɹ], [heɹ] and [pʰjuɹ] in GA. In RP, it is common to find a 
monophthongal variant. For the SQUARE vowel in 
contemporary RP, it is common to find a long monophthong: 
[ɛ:]. For the CURE vowel, many words of that lexical set are 
now pronounced by RP speakers with the long monophthong 
[ɔ:], as in the word sure, pronounced [ʃɔ:]. 







Notes 

1 Accents of English vary with respect to the final vowel in 
words such as quickly, witty and pretty. Some have an [i]-type 
vowel, while others have an [ɪ]-type vowel. The former 
vowel is sometimes said to be ‘more tense’ than the latter, 
and accents with [i] in these words are sometimes described 
as having ‘HAPPY tensing’. The latter term is due to Wells 
(1982; see Suggested Further Reading). 

2 Among younger speakers of RP, these diphthongs are 
frequently uttered with a fronted unrounded second element; 
we could transcribe these pronunciations as [əɨ] and [aɨ], 
where the second symbol denotes a relatively high central 
unrounded vowel, as in coke: [kʰəɨk] and down: [daɨn]. The 
effect is to make coke sound rather like cake, and down 
rather like dine. There are many other such pairs; the 
principal point about them is that the pairs are still distinct, 
but less markedly so than in the past. 


Exercises 

1 Transcribe phonetically the vowel which you utter in 
each of the following words: 

caught court cot blew put dearth death feel fi 
(See the note under chapter 3, exercise 2.) 

2 For native speakers of English: the vowels in the 
following words are normally diphthongs in RP. For each 
word, phonetically transcribe the vowel as you would 
normally say it, with the appropriate symbol. If it is a 
diphthong in your speech, describe the initial and final lip 
and tongue configurations. If it is not a diphthong, say 
how back/front it is, how high/low and whether it is 
rounded. 


fear fair tour late sighed side join toad towed 
(Again, see the note under chapter 3, exercise 2.) 



Listen to sound 
files online 

3 Listen to Track 4.1 at www.wiley.com/go/carrphonetics. 
Transcribe, with as much phonetic detail as possible, each 
of the words you hear: 

carted concluded divine divinity serene serenity 

5 


The Phonemic Principle 

5.1 Introduction: Linguistic Knowledge 

We have been dealing, thus far, with phonetics, that is (as we 
have defined it), with the study of human speech sounds 
(although we have dealt exclusively with English phonetics, 
and in particular, exclusively articulatory phonetics, ignoring 
important facts about the acoustic properties of the speech 
sounds we have been discussing). We will, henceforth, be 
dealing with phonology, as well as phonetics. Phonology, we 
will claim, is to do with something more than properties of 
human speech sounds per se. Phonology is the study of certain 
sorts of mental organization. In particular, it is the study of 
certain types of mental category, mentally stored 
representations, and generalizations concerning those categories 
and representations. On this view, phonology is not the study 
of human speech sounds per se, although phonetics and 
phonology are inextricably intertwined. The point of this 
chapter is to demonstrate what the difference between the two 
is, and to begin to introduce the reader to the phonology of 
English. Let us begin by considering some general questions 
concerning what it is to know a language. 

Let us assume that when we say that someone knows a 
language, in the sense of being a native speaker of that 
language, he or she is in a certain mental state, or possesses a 
certain sort of linguistic knowledge. Knowledge of a native 
language is, apparently, largely unconscious knowledge. It 
appears to contain semantic knowledge (to do with the 
meanings of words, phrases and sentences) and syntactic 
knowledge (to do with the syntactic categories of words, with 
the structure of phrases and sentences and with the syntactic 
relations between words, phrases and clauses). We know that 
this is so, since speakers are able to make syntactic and 
semantic judgements, based on that knowledge. For instance, a 
native speaker of English can judge that Who did you see 
Graham with? is an English sentence, and that Who did you 
see Graham and? is not. The speaker knows, again intuitively, 
that the difference between the two amounts to more than the 
difference between the mere presence of the word and as 
opposed to the presence of the word with. He or she also 
knows intuitively (not necessarily fully consciously) in what 
sense He told the man who he knew is ambiguous, and in what 
sense the two interpretations of that sequence of words differ 
in structure and meaning from He told the man how he knew, 
over and above the superficial fact that one sequence contains 
who and the other how. That knowledge is clearly unconscious 
knowledge, since we require no instruction to be able to make 
such judgements, and we can make them in the absence of any 
conscious knowledge whatsoever of the syntax and semantics 
of English (one could make such judgements even if one had 
not the faintest idea of what a noun or a verb might be, or 
what the syntactic categories of with, and, who and how might 
be). 

We will take the view in this book that a speaker’s (largely) 
unconscious knowledge of his or her native language(s) must 
also contain phonological knowledge. One of the reasons many 
linguists take this view is that speakers can make judgements 
which, it is claimed, are in some sense parallel to those made 
with respect to syntactic states of affairs. For instance, a native 
speaker of English can tell how many syllables there are in a 
word without having the faintest idea, consciously, as to what a 
syllable is. This shows that the native speaker has the ability to 
recognize syllables, even if the recognition of syllables lies 
below the level of consciousness. In a similar fashion, it is 
claimed, a native speaker of English can tell that the sequence 
of segments [blɪg], considered as an utterance of a word, is an 
English sequence, whereas the sequence of segments [tʰlɪg] is 
not, despite the fact that she or he may well never have heard 
either sequence in her or his life. Let us postulate that, in 
making such judgements, the native speaker of English gains 
access to a kind of unconscious knowledge which constitutes 
‘the phonology of English’. 

Our task, in this book, will be to begin to consider, in an 
elementary way, what form that knowledge takes. The 
discipline of phonology, under this view, differs from that of 
phonetics, since it is the study, not of speech sounds per se, 
but of mental abilities and largely unconscious mental states. 
Clearly, the phonologist must pay close attention to speech 
sounds and their properties; they will constitute much of the 
evidence the phonologist brings to bear on his or her 
hypotheses about speakers’ unconscious phonological 
knowledge, but they do not constitute his or her object of 
inquiry as such. 

5.2 Contrast vs Predictability: The Phoneme 

Let us begin by considering voiceless unaspirated and voiceless 
aspirated stops in English and Korean. Speakers of most 
accents of English habitually utter both aspirated and 
unaspirated voiceless stops. The following English data exhibit 
both of these. 

(1) Aspirated and unaspirated voiceless stops in English 

(a)  [ˈpʰu:l]      pool
(b)  [əˈpʰɪə]      appear
(c)  [ˈspɜ:t]      spurt
(d)  [dəˈspaɪt]    despite
(e)  [ˈtʰɒp]       top
(f)  [əˈtʰæk]      attack
(g)  [ˈstɒp]       stop
(h)  [dəˈstɹɔɪ]    destroy
(i)  [ˈkʰɪlɪŋ]     killing
(j)  [əˈkʰɹu:]     accrue
(k)  [ˈskəʊld]     scold
(l)  [dɪˈskʌvə]    discover

The diacritic which precedes certain symbols in these data 
(the one which precedes the ‘p’ symbol in [ˈpʰu:l]) indicates the 
beginning of a stressed syllable. 

From these data, it appears that voiceless stops are aspirated 
when they are at the beginning of a stressed syllable, as in pool 
and appear, but unaspirated when preceded by a voiceless 
alveolar fricative, as in spurt. That is, in these data, wherever 
the unaspirated voiceless stops appear, the aspirated ones do 
not, and vice versa. Compare the English data with the 
following data from Korean: 

(2) Aspirated and unaspirated voiceless stops in Korean 

(a)  [pʰul]     ‘grass’
(b)  [pul]      ‘fire’
(c)  [tʰal]     ‘mask’
(d)  [tal]      ‘moon’
(e)  [kʰɛda]    ‘dig’
(f)  [kɛda]     ‘fold’

In these Korean data, aspirated and unaspirated voiceless 
stops may occur in the same place (at the beginning of a 
word). The range of places within a word which a given sound 
may occur in is called its distribution. In the English data we 
have looked at, the distribution of unaspirated and aspirated 
stops is mutually exclusive: where you get one kind of stop, 
you never get the other. This is called complementary 
distribution. 
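The notion of distribution can be made concrete with a short computational sketch. The Python fragment below is an illustration of ours and not a standard phonological tool: it assumes that each word is supplied as a list of segments, it treats the immediately preceding and following segments as a sound's context (with ‘#’ standing for a word edge), and it counts two sounds as being in complementary distribution if they never share a context in the word list supplied. The function names are hypothetical, and stress marks are left out of the segment lists.

```python
def environments(words, target):
    """Collect the (previous, next) contexts in which `target` occurs.

    '#' marks a word edge. Each word is a list of segment strings.
    """
    contexts = set()
    for segments in words:
        padded = ["#", *segments, "#"]
        for i in range(1, len(padded) - 1):
            if padded[i] == target:
                contexts.add((padded[i - 1], padded[i + 1]))
    return contexts


def in_complementary_distribution(words, a, b):
    """True if sounds `a` and `b` never share a context in `words`."""
    return environments(words, a).isdisjoint(environments(words, b))


# A few items from (1) and (2), entered by hand as segment lists.
english = [
    ["pʰ", "u:", "l"],                # pool
    ["ə", "pʰ", "ɪə"],                # appear
    ["s", "p", "ɜ:", "t"],            # spurt
    ["d", "ə", "s", "p", "aɪ", "t"],  # despite
]
korean = [
    ["pʰ", "u", "l"],  # 'grass'
    ["p", "u", "l"],   # 'fire'
]

print(in_complementary_distribution(english, "p", "pʰ"))  # True
print(in_complementary_distribution(korean, "p", "pʰ"))   # False
```

A check of this kind is only as good as the data it is given: with a handful of words, two sounds may fail to share a context by accident. The linguist's statement of the context (‘after a voiceless alveolar fricative’) is a generalization over the data, not a list of observed neighbours.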

Furthermore, if we take, say, the stops [t] and [tʰ] in the 
English data, it is clear that they are phonetically similar: both 
are stops, both are voiceless, both are alveolar. And yet, for 
most speakers of English, the alveolar stops in, say, still and 
till sound the same, despite the fact that the former is 
unaspirated and the latter aspirated. For the English speaker, 
these two phonetically distinct sounds ‘count as the same 
thing’. We cannot say, without contradiction, that they are 
simultaneously ‘the same sound’ and ‘not the same sound’. 
What we will say is that, while they are phonetically distinct, 
they are phonologically equivalent. That is, the two types of 
stop correspond to, are interpreted as belonging to, a single 
mental category. We will refer to such a category as a 
phoneme. The English speaker interprets the six phonetic 
segments [p], [pʰ], [t], [tʰ], [k] and [kʰ] in terms of only three 
phonemes: /p/, /t/ and /k/. We may depict this as follows: 

(3) English voiceless stop phonemes 





     /p/            /t/            /k/

  [p]  [pʰ]      [t]  [tʰ]      [k]  [kʰ]


The top line here represents the three voiceless stop 
phonemes (mental categories) in terms of which the six types 
of phonetic segment are perceived. The relationship between 
phonemes and their associated phonetic segments is one of 
realization, so that the phoneme /p/, for instance, is realized 
as [p] after a voiceless alveolar fricative, and as [pʰ] elsewhere. 
The most important point is that, on the data we have seen 
thus far, aspiration or the lack of it is entirely predictable in 
English: there is a generalization, expressible as a general rule, 
as to the contexts in which voiceless stops will and will not be 
aspirated. For most accents of English, this generalization is 
one that is internalized by children when they acquire English 
as their native language. The generalization forms part of what 
native speakers know in knowing their native language, even if 
that knowledge is largely unconscious knowledge. Realizations 
of a phoneme which are entirely predictable from context are 
called its allophones. We therefore say that [p] and [pʰ] are 
allophones of the /p/ phoneme in most accents of English. We 
are claiming that native speakers of English possess phonemes 
(which are mental categories) and phonological generalizations 
or rules as part of their (largely unconscious) knowledge of 
their native language, and that native speakers perceive the 
allophones they hear in terms of those categories and 
generalizations. 
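To say that aspiration is predictable is to say that it can be computed from context. The sketch below, in Python, is one hypothetical way of stating such a rule mechanically; it is not a claim about how speakers implement it. It follows the stress-based statement of the context given above (voiceless stops are aspirated at the beginning of a stressed syllable) and leaves every other voiceless stop plain, which matches the transcriptions in (1). The phonemic inputs, the use of a separate stress token and the function name are all assumptions made for the illustration.

```python
VOICELESS_STOPS = {"p", "t", "k"}
STRESS = "ˈ"  # placed, as in (1), before the first segment of a stressed syllable


def realize_voiceless_stops(phonemes):
    """Map a list of phonemes to phones.

    /p t k/ are aspirated only when they begin a stressed syllable,
    i.e. when the stress mark immediately precedes them.
    """
    phones = []
    for i, segment in enumerate(phonemes):
        begins_stressed_syllable = i > 0 and phonemes[i - 1] == STRESS
        if segment in VOICELESS_STOPS and begins_stressed_syllable:
            phones.append(segment + "ʰ")
        else:
            phones.append(segment)
    return phones


print(realize_voiceless_stops(["ˈ", "p", "u:", "l"]))       # pool
print(realize_voiceless_stops(["ə", "ˈ", "t", "æ", "k"]))   # attack
print(realize_voiceless_stops(["ˈ", "s", "p", "ɜ:", "t"]))  # spurt
```

In spurt the stress mark precedes the fricative and not the stop, so the /p/ surfaces unaspirated, as in the data. Nothing in the input needs to record aspiration: it is supplied entirely by the rule.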

Compare the English situation with the Korean one. It is 
clear that the distribution of aspirated and unaspirated voiceless 
stops in Korean is overlapping: there is at least one place (at 
the beginning of words) in which either type of sound may 
occur. This kind of distribution is referred to as parallel 
distribution, where ‘parallel’ means ‘overlapping to some 
degree’. 

Furthermore, the distinction between aspirated and 
unaspirated voiceless stops can make a crucial difference in 
Korean: when the Korean speaker says [pʰul], it does not mean 
the same thing as [pul]. The difference between the two 
sounds is said to be semantically contrastive. Pairs of words 
which differ with respect to only one sound are called minimal 
pairs. Their existence is important, since they demonstrate that 
the two sounds in question are both in parallel distribution and 
semantically contrastive. 
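The search for minimal pairs is mechanical enough to sketch in a few lines of Python. The fragment below is an illustration under stated assumptions: words are stored as tuples of segments under their English glosses, and two words count as a minimal pair if they have the same number of segments and differ in exactly one position. The function names and the tiny word list (four items from the Korean data above) are ours.

```python
from itertools import combinations


def is_minimal_pair(word_a, word_b):
    """True if the two segment sequences differ in exactly one position."""
    if len(word_a) != len(word_b):
        return False
    differences = sum(1 for x, y in zip(word_a, word_b) if x != y)
    return differences == 1


def minimal_pairs(lexicon):
    """Return the pairs of glosses whose transcriptions form minimal pairs."""
    return [
        (gloss_a, gloss_b)
        for (gloss_a, word_a), (gloss_b, word_b) in combinations(lexicon.items(), 2)
        if is_minimal_pair(word_a, word_b)
    ]


korean = {
    "grass": ("pʰ", "u", "l"),
    "fire": ("p", "u", "l"),
    "mask": ("tʰ", "a", "l"),
    "moon": ("t", "a", "l"),
}

print(minimal_pairs(korean))  # [('grass', 'fire'), ('mask', 'moon')]
```

Note that the check says nothing about meaning: it is because the glosses differ that each pair found here shows the two sounds to be semantically contrastive.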

We therefore want to say that, unlike the English speaker, 
the Korean speaker perceives the six aspirated and unaspirated 
voiceless stops [p], [pʰ], [t], [tʰ], [k] and [kʰ] in terms of six 
different mental categories. That is, [p], for instance, is a 
realization of the /p/ phoneme, whereas [pʰ] is a realization of a 
distinct /pʰ/ phoneme. We may depict (part of) the Korean 
system thus: 

(4) Some Korean voiceless stop phonemes 

/p/    /pʰ/    /t/    /tʰ/    /k/    /kʰ/

[p]    [pʰ]    [t]    [tʰ]    [k]    [kʰ]

The distinction between aspirated and unaspirated voiceless 
stops is phonemic in Korean but allophonic in English. Both 
English and Korean speakers habitually utter both aspirated 
and unaspirated voiceless stops. On the phonetic level, the two 
languages are therefore equivalent as far as bilabial, alveolar 
and velar voiceless stops are concerned. But at the phonemic 
level (the mental level), the two languages are quite distinct: the 
Korean speaker has six mental categories where the English 
speaker has only three. As far as voiceless stops are 
concerned, Korean speakers have twice as many phonemic 
contrasts as English speakers. The difficulty which the English 
speaker encounters in learning to pronounce and perceive 
Korean voiceless stops is therefore a mental one; it is a 
phonological difficulty, not a purely articulatory one. 

This is not to deny that there can be purely articulatory 
difficulties in learning to speak another language (difficulties in 
articulating new types of sound which one is not in the habit of 
articulating). For instance, most speakers of Japanese who are 
learning to speak English will have to learn to pronounce the 
sound [l], which they are not in the habit of pronouncing. 
When learners of a foreign language face this task, they often 
utter a sound from their native language which is similar to the 
target sound: in this case, the tap [ɾ] which, like [l], is voiced 
and alveolar. Similarly, a speaker of French who is trying to 
master the English sound [ð] will often utter the voiced alveolar 
fricative [z] or the voiced dental stop [d̪], which she or he is 
used to uttering in her or his native language. The former is 
similar to the target sound in being a voiced fricative, while the 
latter is similar in being a voiced dental sound. Such problems 
with the pronunciation of foreign languages are widespread. 
But they are distinct in kind from the kind of problem we have 
just discussed. 

We need not deny either that there may be difficulties in the 
pronunciation of a foreign language which involve both purely 
articulatory and phonological difficulties. For instance, the 
English speaker who is learning Korean must learn to articulate 
a third kind of stop which is distinct from voiced stops, 
aspirated voiceless stops and unaspirated voiceless stops. 
These are the voiceless stops of Korean which are articulated 
with ‘glottal tension’: during their production, the vocal cords 
do not vibrate, but nor are the vocal cords spread apart, as 
they are for the voiceless aspirated stops; rather, the vocal 
cords are constricted. 

The English speaker must also learn to (in a sense) perceive 
the distinction between all three sorts of stop in Korean; since 
the glottally constricted voiceless stops are a new category of 
sound, they may seem to the English speaker to sound like 
stops he or she is more used to hearing (voiced stops, for 
instance). And that is a phonological difficulty, added to the 
purely articulatory one which the English speaker also has. 
However, it is clear from the data we have looked at here that 
there is a type of difficulty which is exclusively phonological, 
and it is that kind of difficulty which justifies our making a 
distinction between the kind of articulatory phonetics discussed 
in the preceding chapters, which constitutes the study of the 
articulation of speech sounds in and of themselves, and 
phonology, the study of the system of mental categories in 
terms of which we interpret those speech sounds. 

In examining the phonological differences between Korean 
and English voiceless stops, we have adopted what is known as 
the phonemic principle, which consists of two sets of two 
criteria, as follows: 

(5) The phonemic principle 

Two or more sounds are realizations of the same phoneme 
if: 

(a) they are in complementary distribution 
and 

(b) they are phonetically similar. 

Two or more sounds are realizations of different 
phonemes if: 

(a) they are in parallel (overlapping) distribution 
and 

(b) they serve to signal a semantic contrast. 

It is on the basis of the phonemic principle that we say that 
phonetic differences involving aspiration are allophonic in 
English but phonemic in Korean. 

We have just seen a case where the Korean speaker has 
more phonemic contrasts than the English speaker. Let us now 
look at another set of data where the converse is the case. 
Native speakers of some varieties of Scottish English habitually 
utter the speech sounds we have represented as ‘[ɾ]’ and ‘[l]’, 
i.e. the voiced alveolar tap and the voiced lateral alveolar 
approximant (as in rip and lip). So do speakers of Korean. 
Here are some examples of Scottish English and Korean words 
which contain those sounds: 


(6) [ɾ] and [l] in Scottish English and Korean 



     Scottish English             Korean

(a)  [læm]    lamb           (b)  [mul]        ‘water’
(c)  [ɾæm]    ram            (d)  [mulkama]    ‘place for water’
(e)  [lɪp]    lip            (f)  [muɾe]       ‘at the water’
(g)  [ɾɪp]    rip            (h)  [mal]        ‘horse’
(i)  [bɛɾi]   berry          (j)  [malkama]    ‘place for horse’
(k)  [bɛli]   belly          (l)  [maɾe]       ‘at the horse’


While speakers of Scottish English and Korean habitually 
utter both sounds, we can predict that many native speakers of 
Korean who are learning to speak this variety of Scottish 
English would find the distinction between [l] and [ɾ], when 
they speak Scottish English, rather difficult to get the hang of. 
On the face of it, this is puzzling because, as we have just said, 
Korean speakers have no difficulty in uttering the two sounds, 
and may well have uttered many thousands of them, long 
before beginning to learn Scottish English. So wherein does the 
problem reside? One possibility that can be immediately 
discounted is the suggestion that Korean speakers are 
encountering some kind of physical, articulatory difficulty: it is 
clearly not the case, as we have seen, that either of the sounds 
is new to them. 

The difficulty is of a mental nature, and if one examines the 
table of data in (6) above, it is clear that, in Scottish English, 
the two sounds may occur in the same places within a word, 
e.g. at the beginning of words, or between vowels. 
Furthermore, two words may differ solely with respect to the 
segments [ɾ] and [l]: there are minimal pairs involving the two 
sounds ([ɾæm] vs [læm], for instance). In this variety of 
Scottish English, [ɾ] and [l] are in parallel distribution and can 
function to signal a semantic contrast. It is important to bear in 
mind that, when we say that a phonetic difference is 
contrastive, we refer to a semantic contrast, and not to a 
phonetic difference between the sounds. 

In Korean, the distinction between [ɾ] and [l] can never be 
contrastive, since [ɾ] and [l] may never occur in the same 
place. They are in complementary distribution: where one 
occurs, the other never does, and vice versa. Specifically, [ɾ] in 
Korean occurs between vowels but nowhere else, whereas [l] 
never occurs between vowels, but may occur elsewhere. 
Because of this, it is impossible to find minimal pairs involving 
these two sounds in Korean. The two sounds are also 
phonetically similar: both are voiced and both entail a closure 
made between the centre of the tongue blade and the alveolar 
ridge. Therefore the two sounds are realizations of the same 
phoneme in Korean. 

In this variety of Scottish English, there is a phonemic /ɾ/ vs 
/l/ contrast. In Korean, on the other hand, there is no such 
phonemic contrast: whereas this variety of Scottish English has 
/ɾ/ vs /l/, Korean has only one phoneme: /l/, which has two 
allophones, [ɾ] and [l]. Put another way, the difference 
between the sounds [ɾ] and [l] is phonemic in Scottish English, 
whereas the difference between [ɾ] and [l] is allophonic in 
Korean. Speakers of this variety of English perceive [ɾ] and [l] 
in terms of two distinct mental categories, whereas Korean 
speakers perceive them in terms of a single mental category. In 
Korean, the phoneme /l/ is realized as [ɾ] between vowels, and 
is realized as [l] elsewhere. 

We may depict this phonological difference between this 
variety of Scottish English and Korean as follows: 

(7) The phonemic status of [ɾ] and [l] in Scottish English 
and Korean 

                     Scottish English speakers    Korean speakers
Phonemic units:           /l/      /ɾ/                  /l/

Allophonic units:         [l]      [ɾ]               [l]   [ɾ]

We have now shown where the Korean speakers’ difficulty 
resides: at the level of their (largely) unconscious knowledge of 
their language. As far as these segments are concerned, Korean 
and this variety of Scottish English do not differ at the 
allophonic level: both have [ɾ] and [l]. But they do differ at the 
phonemic level: the Scottish English speaker has a mental 
distinction which the Korean speaker lacks; the Korean 
speakers’ problem is thus mental (specifically, perceptual) in 
nature, not articulatory. 

We have said that it is entirely predictable which allophone of 
the Korean /l/ phoneme will occur in a given context. We may 
say that there is a phonological generalization governing the 
occurrence of the allophones, which the native speakers of 
Korean have unconsciously grasped, and which forms part of 
their linguistic knowledge. We may express that generalization 
in terms of a phonological rule, as follows: 

(8) /l/ realization in Korean 

/l/ is realized as [ɾ] between vowels. 
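
A rule of this kind is explicit enough to be stated as a small procedure. The Python sketch below is offered as an illustration only: the phonemic forms fed to it (/mul/, /mule/, /mulkama/) are our assumptions, inferred from the phonetic forms in (6), and the vowel inventory listed is a deliberately small, hypothetical one that covers just those forms.

```python
# Assumption: a minimal vowel set, sufficient for the forms in (6) only.
VOWELS = {"a", "e", "i", "o", "u", "ɛ"}


def realize_korean_l(phonemes):
    """Realize /l/ as the tap between vowels and as [l] elsewhere."""
    phones = []
    for i, segment in enumerate(phonemes):
        between_vowels = (
            0 < i < len(phonemes) - 1
            and phonemes[i - 1] in VOWELS
            and phonemes[i + 1] in VOWELS
        )
        if segment == "l" and between_vowels:
            phones.append("ɾ")
        else:
            phones.append(segment)
    return phones


print("".join(realize_korean_l(list("mul"))))      # mul
print("".join(realize_korean_l(list("mule"))))     # muɾe
print("".join(realize_korean_l(list("mulkama"))))  # mulkama
```

The three outputs correspond to ‘water’, ‘at the water’ and ‘place for water’ in (6): a single stored /l/ yields both allophones, depending solely on what surrounds it.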

As we will see, the linguistic knowledge of native speakers 
contains many such generalizations. As far as [ɾ] and [l] are 
concerned, the phonological knowledge of the Korean speaker 
and that of the Scottish English speaker differ in two respects: 
(a) the Scottish English speaker has a phonological distinction 
which the Korean speaker lacks, and (b) the Korean speaker 
possesses a phonological generalization which the Scottish 
English speaker lacks. Phonological knowledge therefore 
consists of, among other things, phonological categories and 
phonological generalizations. 

In several varieties of English, the /l/ phoneme also has 
allophones: ‘clear l’ ([l]) and ‘dark l’ ([ɫ]). The following data 
show the typical distribution of these two sounds in those 
varieties: 

(9) English ‘clear l’ and ‘dark l’ 

(a)  [kʰlɛvə]   clever        (b)  [bɛɫz]     bells
(c)  [pʰleɪn]   plain         (d)  [tɹeɪɫ]    trail
(e)  [lʊk]      look          (f)  [pʰʊɫ]     pull
(g)  [lɔ:]      law           (h)  [bɔ:ɫz]    balls
(i)  [laɪ]      lie           (j)  [pʰaɪɫ]    pile


One way of stating the distribution of the allophones is to say 
that ‘clear l’ occurs immediately before vowels, whereas ‘dark 
l’ occurs immediately after vowels. We may state the 
relationship between the /l/ phoneme and its clear and dark 
allophones in terms of the following rule (which we will later 
express in terms of syllable structure): 

(10) /l/ realization in English 

/l/ is realized as [ɫ] immediately after a vowel. 
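
Rule (10) can be sketched in the same mechanical fashion. The Python below is an illustration under stated assumptions, not a definitive formulation: each segment is a string, long vowels and diphthongs count as single segments (for example "ɔ:" or "aɪ"), and a segment is treated as a vowel if its first character belongs to the hypothetical set `VOWEL_CHARS`. Aspiration is left out of the input forms.

```python
# Assumed set of vowel symbols; a segment such as "ɔ:" or "aɪ" counts as a
# vowel because its first character is in this set.
VOWEL_CHARS = set("iɪeɛæaɑɒɔoʊuʌəɜ")

def is_vowel(segment):
    return segment[0] in VOWEL_CHARS

def realize_english_l(phonemes):
    """Apply rule (10): /l/ is realized as dark [ɫ] immediately after a vowel."""
    realizations = []
    for i, phoneme in enumerate(phonemes):
        after_vowel = i > 0 and is_vowel(phonemes[i - 1])
        if phoneme == "l" and after_vowel:
            realizations.append("ɫ")
        else:
            # Clear [l] is the realization in all other positions.
            realizations.append(phoneme)
    return realizations
```

Applied to forms from (9), `realize_english_l(["p", "ʊ", "l"])` yields a dark l for pull, while `realize_english_l(["l", "ʊ", "k"])` leaves the clear l of look untouched. Because the sketch follows the linear statement of the rule literally, it would also assign a dark l to an /l/ between two vowels; this is one reason why the rule is better expressed in terms of syllable structure.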

We may depict the realizations of Korean /l/ and /l/ in certain 
varieties of English as follows: 

(11) /l/ realizations in Korean and English 

      Korean            English 

       /l/                /l/ 
      /   \              /   \ 
    [l]   [r]          [l]   [ɫ] 

5.3 Phonemes, Allophones and Contexts 

We have said that the allophones of a phoneme are predictable 
realizations of that phoneme. We can predict which allophone 
will occur, given a specific context. The sorts of context we 
have cited are, in some cases, rather general. For instance, in 
the Korean data we considered, we saw that aspirated and 
unaspirated voiceless stops may both occur at the beginning of 
a word. We also saw, in the Korean data that we looked at, 
that Korean /l/ is realized as [r] between vowels. ‘At the 
beginning of a word’ and ‘between vowels’ are quite general 
contexts. So is ‘at the end of a word’, or ‘before a consonant’, 
or ‘after a vowel’. 

In other cases, the contexts we need to refer to are more 
specific. For instance, in the English data we looked at, we saw 
that the unaspirated voiceless stops occurred after a voiceless 
alveolar fricative. In many cases, there appears to be some 
kind of phonetic connection between the context in which an 
allophone occurs and the nature of the allophone itself. Let us 
consider an example. 

In many accents of English, the /ɹ/ phoneme has two 
realizations: [ɹ] and [ɹ̥] (in which the subscript diacritic denotes 
voicelessness). The following data exemplify this: 

(12) Voiced and voiceless allophones of /ɹ/ in English 

(a) [tʰɹ̥aɪ]    try         (b) [əɹeɪ]     array 
(c) [pʰɹ̥u:v]   prove       (d) [gɹəʊ]     grow 
(e) [kʰɹ̥eɪv]   crave       (f) [bɹeɪk]    break 
(g) [fɹ̥i:]     free        (h) [dɹɪŋk]    drink 
(i) [θɹ̥i:]     three       (j) [bæɹəʊ]    barrow 


It is clear that the voiced and voiceless alveolar approximants 
are in complementary distribution: the voiceless one appears 
only after voiceless consonants, and the voiced one appears 
elsewhere. The question is whether we should say that there is 
a voiced alveolar approximant phoneme which is realized as a 
voiceless allophone after voiceless consonants, or that there is 
a voiceless alveolar approximant phoneme which is realized as 
a voiced approximant after voiced consonants and between 
vowels. We choose the former claim, since it is more 
phonetically natural: approximants are normally voiced. 
Additionally, we can make phonetic sense of the claim that a 
voiced phoneme has a voiceless realization when it follows 
voiceless consonants: the realization is assimilating to the 
preceding segment (it is becoming more like an adjacent 
segment). 

Let us consider another case of this sort. In many accents of 
English, there are stops which are articulated in front of the 
velar place of articulation, close to the hard palate. The 
following data exemplify this ([c] and [ɟ] represent a voiceless 
and a voiced palatal stop, respectively): 

(13) Velar and palatal stops in English 

(a) [kʰu:l]    cool        (b) [cʰi:p]    keep 
(c) [kʰəʊl]    coal        (d) [cʰi:n]    keen 
(e) [kʰɒp]     cop         (f) [cʰɪt]     kit 
(g) [kʰɑ:t]    cart        (h) [scɪp]     skip 
(i) [gu:l]     ghoul       (j) [ɟɪə]      gear 
(k) [gəʊl]     goal        (l) [ɟɪl]      gill 

Once again, the two segment types are in complementary 
distribution: the advanced, palatal articulations occur before 
high front vowels, and the velar ones occur elsewhere. We 
postulate a /k/ phoneme which is ‘fundamentally’ velar in its 
place of articulation, but which has a fronted or advanced 
realization before high front vowels. This makes phonetic 
sense: high front vowels are palatal articulations (the 
articulators are the front of the tongue and the hard palate), so 
we can say that the velar phoneme is assimilating to the 
following vowel when it is a high front vowel. 
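
The two assimilation analyses above (devoicing of /ɹ/ after voiceless consonants, and fronting of velar stops before high front vowels) share a shape: a phoneme has a realization that occurs unless a specifiable context shifts it. The following Python sketch illustrates that shape only. The table `CONTEXT_SHIFTS`, the helper names and the small sets of consonants and vowel symbols are our own assumptions, and aspiration is ignored.

```python
# Assumed, simplified inventories for this sketch.
VOICELESS_CONSONANTS = {"p", "t", "k", "f", "θ", "s", "ʃ"}
HIGH_FRONT_VOWEL_CHARS = {"i", "ɪ"}
VOICELESS_R = "ɹ\u0325"  # ɹ followed by the combining 'voiceless' ring

def after_voiceless_consonant(previous, following):
    return previous in VOICELESS_CONSONANTS

def before_high_front_vowel(previous, following):
    # "i:" and "ɪə" both count, since they begin with a high front vowel.
    return following is not None and following[0] in HIGH_FRONT_VOWEL_CHARS

# For each phoneme: the contexts that shift it away from its usual realization.
CONTEXT_SHIFTS = {
    "ɹ": [(after_voiceless_consonant, VOICELESS_R)],
    "k": [(before_high_front_vowel, "c")],
    "g": [(before_high_front_vowel, "ɟ")],
}

def realize(phonemes):
    """Realize each phoneme as itself unless a listed context applies."""
    realizations = []
    for i, phoneme in enumerate(phonemes):
        previous = phonemes[i - 1] if i > 0 else None
        following = phonemes[i + 1] if i + 1 < len(phonemes) else None
        realization = phoneme  # the realization found 'elsewhere'
        for context_holds, shifted in CONTEXT_SHIFTS.get(phoneme, []):
            if context_holds(previous, following):
                realization = shifted
                break
        realizations.append(realization)
    return realizations
```

For example, `realize(["k", "i:", "p"])` fronts the initial stop of keep, whereas `realize(["k", "u:", "l"])` leaves the velar stop of cool as it is; `realize(["t", "ɹ", "aɪ"])` devoices the approximant in try, but `realize(["g", "ɹ", "əʊ"])` does not do so in grow.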

We are adopting the view that phonemes often have a kind 
of ‘default’ or ‘basic’ phonetic realization, and that it is this 
realization which will occur in the absence of specifiable 
contexts which ‘shift’ the realization from its default one. 

5.4 Summing Up 

In this chapter, we have begun to distinguish between 
phonetics, defined as the study of speech sounds per se, and 
phonology, the study of the system of mental representation, 
categories and generalizations to which those sounds are 
related. Native speakers of a language tend to take its 
phonological system for granted. Speakers of English, for 
instance, think it perfectly obvious that [ɹ] and [l] are quite 
distinct, despite the fact that they are, phonetically, very 
similar. Equally, speakers of English cannot easily see that [p] 
and [pʰ] are different, despite the fact that they are. This 
chapter has sought to show that what underlies these 
perceptions is the phonological system of the native language, 
as distinct from, if intimately related to, the set of speech 
sounds uttered by native speakers of the language. Which 
sounds one takes to be ‘the same’ or ‘different’ depends to a 
large extent on the system of mental categories which 
constitutes one’s native language phonology. But it is clear that 
phonetics and phonology are intimately connected. 

The extent to which our mentally stored system of language- 
specific phonological categories governs our perception of a 
stream of speech sounds was well expressed by the linguist 
Edward Sapir, who worked with North American Indian 
languages in the early twentieth century: 

the unschooled recorder of language, provided he has a good 
ear and a genuine instinct for language, is often at a great 
advantage as compared with the minute phonetician, who is 
apt to be swamped by his mass of observations. I have 
already employed my experience in teaching Indians to write 
their own language for its testing value in another 
connection. It yields equally valuable evidence here. I found 
that it was difficult or impossible to teach an Indian to make 
phonetic distinctions that did not correspond to ‘points in the 
pattern of his language’, however these differences might 
strike our objective ear, but that subtle, barely audible, 
phonetic differences, if only they hit the ‘points in the 
pattern’, were easily and voluntarily expressed in writing. In 
watching my Nootka interpreter write his language, I often 
had the curious feeling that he was transcribing an ideal flow 
of phonetic elements which he heard, inadequately from a 
purely objective standpoint, as the intention of the actual 
rumble of speech.⁵ 

One can begin to appreciate the extent to which one’s native 
language phonemic categories affect one’s perception when 
one considers that any normal 6-month-old child, no matter 
what language he or she is beginning to acquire, can distinguish 
aspirated and unaspirated voiceless stops. Clearly, then, the 
aspirated/unaspirated difference is one that could in principle 
act as the basis for a phonemic distinction, and it clearly does 
act that way in many human languages. But a child who 
acquires a language (such as most varieties of English) in which 
the aspirated/unaspirated distinction is allophonic rather than 
phonemic will come to ignore that distinction at a certain level 
of awareness. Acquiring the phonology of one’s native 
language can therefore result in a kind of loss of perceptual 
discrimination, but only at one level of awareness: when a 
speaker of, say, South African English utters unaspirated stops 
instead of aspirated stops, this will often be noticed by a 
speaker of, say, RP, even if the RP speaker notices only that 
there is something different about the speech of the South 
African English speaker. Indeed, such differences can be quite 
striking to the speaker of a language in which unaspirated stops 
never occur word-initially before a stressed vowel. Such 
speakers, on being suddenly confronted by English spoken 
with, say, a Greek accent (on arrival, say, at a Greek airport) 
will typically think that a word such as Gatwick (one of the 
London airports) is being pronounced as [gadwig]. In cases 
such as this, the English speaker not only perceives the fact 
that the stops in question are unaspirated, but also assigns them 
to the category of English voiced stops, because voiced stops 
in English are unaspirated, and word-initial and word-final 
voiced stops in English are barely voiced at all. 

Both the native speaker and the adult learner of English can 
begin to develop an awareness of her or his own phonological 
system, and of the immense influence this has on one’s 
perception of speech sounds, by comparing and contrasting 
languages which are phonetically identical (or nearly identical), 
but phonologically distinct, with respect to some set of sounds. 
The examples given in this chapter are designed to begin to 
induce this kind of awareness, as are the exercises which 
follow. 


Notes 

1 These data do not show the full range of places in which 
aspirated and unaspirated voiceless stops occur in most 
English accents. What we will have to say about their 
phonological status is therefore very much oversimplified. 
But the data will suffice to illustrate a valid point. 

2 Korean has a third phonemic category of stops, which we 
discuss below. 

3 There are also devoiced allophones of /l/; we ignore these 
here. 

4 We indicate these devoiced sounds here, but henceforth we 
will not transcribe them using the ‘voiceless’ diacritic in cases 
where the devoicing phenomenon is irrelevant to the point 
being made. 

5 Edward Sapir (1921), Language, New York: Harcourt 
Brace, p. 56. 

Exercises 

1 [d] and [ð] in English and Spanish 
(a) English 

1. [dɛn]    den         2. [ðɛn]    then 
3. [dəʊz]   doze        4. [ðəʊz]   those 
5. [dɛə]    dare        6. [ðɛə]    their 
7. [ʌdə]    udder       8. [ʌðə]    other 
9. [aɪdə]   eider      10. [aɪðə]   either 

In many (not all) accents of English, [d] and [ð] are 
realizations of different phonemes, as these data show. 
They are in parallel distribution (both occur at the 
beginnings of words and between vowels). They also 
function contrastively: there are minimal pairs involving 
the two sounds. We are therefore justified in postulating a 
/d/ vs /ð/ phonemic distinction for most accents of 
English. 

(b) Spanish 

Now consider the following Spanish data. (The voiced 
stop in question is in fact dental in Spanish. We overlook 
this fact here.) Is the distinction between [d] and [ð] 
phonemic or allophonic in Spanish? Justify your answer 
with evidence and argumentation. 

1. [dar]       ‘to give’ 
2. [naða]      ‘nothing’ 
3. [deβer]     ‘to have to’ 
4. [boðeɣa]    ‘wine cellar’ 
5. [dias]      ‘days’ 
6. [aβlaðo]    ‘spoken’ 
7. [banda]     ‘ribbon’ 
8. [praðo]     ‘meadow’ 
9. [andar]     ‘to go’ 
10. [poðer]    ‘to be able’ 

2 Voiced stops in English and Korean 
In Korean, /p/, /t/ and /k/ have allophones which are 
unreleased at the end of a word (as can be seen in the data 
below) or before another consonant. The /p/, /t/ and /k/ 
phonemes also have voiced stop allophones: [b], [d] and 
[g]. Unlike English, Korean does not have voiced stop 
phonemes: [b], [d] and [g] are always allophones of /p/, /t/ 
and /k/ in Korean. Examine the following data and say 
what contexts the voiced stop allophones occur in: 


(a)(i) [pul]    ‘fire’           (a)(ii) [ibul]     ‘this fire’ 
(b)(i) [tal]    ‘moon’           (b)(ii) [idal]     ‘this moon’ 
(c)(i) [kan]    ‘liver’          (c)(ii) [igan]     ‘this liver’ 
(d)(i) [pap̚]    ‘cooked rice’    (d)(ii) [pabi]     ‘cooked rice’ (subject) 
(e)(i) [tat̚]    ‘close’          (e)(ii) [tadara]   ‘close it!’ 
(f)(i) [tʃɛk̚]   ‘book’           (f)(ii) [tʃɛgi]    ‘book’ (subject) 


3 Glottal stops in English and Standard Arabic 
The phonetic segment [ʔ] (glottal stop) occurs in the 
speech of most speakers of English, but there is no glottal 
stop phoneme (/ʔ/) in English, since [ʔ] never functions 
contrastively with any other segment. For instance, 
[kʰɛʔəl] and [kʰɛtəl] (kettle) are not pronunciations of 
different words, but different pronunciations of the same 
word. Below are some data from Standard Arabic. Is 
there a glottal stop phoneme in this language? Explain the 
reasoning behind your answer: 

(a) [faʔl]    ‘good fortune’ 
(b) [fatl]    ‘twisting/twining’ 
(c) [faʔr]    ‘rats’ 
(d) [fa:r]    ‘it boiled’ 
(e) [baʔs]    ‘strength’ 
(f) [ba:s]    ‘he kissed’ 
(g) [buʔs]    ‘misery’ 
(h) [bu:s]    ‘bus’ 




4 Further phonetic transcription practice 
Listen to Track 5.1 at www.wiley.com/go/carrphonetics. 
Transcribe, with as much phonetic detail as possible, the 
words you hear on the recording, paying attention to 
details such as presence vs absence of aspiration, clear vs 
dark l, and devoiced allophones of /l/ and /ɹ/. 

started playing price could kill clear creep 

6 


English Phonemes 

6.1 English Consonant Phonemes 

We have distinguished phonemes from phonetic segments, and 
have begun to formulate hypotheses about which phonemes 
might exist as part of the native speaker’s phonological 
knowledge. Specifically, we have said that many English 
speakers have the consonant phonemes /l/, /ɹ/, /p/, /t/ and /k/, 
among others. We will shortly postulate a full system of 
consonant phonemes which English speakers have. But we 
must first be a little more precise about what we mean by 
‘English speakers’. Clearly, there are different varieties of 
English, which we will be considering in more detail later, and 
we will need some means of differentiating between them. Let 
us begin by considering a distinction which is often appealed to 
by linguists: that between accent and dialect. It is often said 
that differences in accent concern solely phonetic and 
phonological variation, whereas dialect differences involve 
more than this: they also include differences in vocabulary and 
syntax. This is a rather simplistic way of putting the distinction, 
and it is a distinction which is fraught with difficulties, but it 
will suffice for the present discussion. 

We may exemplify the difference between accent and dialect 
as follows. Perhaps the most widely spoken (and written) 
English dialect is the ‘prestige’ dialect known as Standard 
English, which has its origins in the South East of England; this 
dialect is used widely, in Britain, in national radio and 
television, in the press, and indeed in most printed publications. 
It is possible to speak Standard English with a New Zealand 
accent, a Tyneside accent, a New York City accent, or indeed 
any accent of English. When this happens, we may say 
(simplifying somewhat) that the speaker is using the vocabulary 
and syntax of Standard English, while retaining the phonetics 
and phonology which constitute the native accent. 

Let us exemplify the difference between dialect and accent in 
a little more detail, as follows. Take the Standard English 
sentence You will not be able to put the children on the floor. 
Uttered by a speaker with a Standard Scottish English (SSE) 
accent, the outcome would be: 

(1) [johmVbicbltopNVdotJikLionondoflou] 

Now compare this with the same Standard English sentence 
uttered by a speaker of RP. The RP speaker might well utter: 

(2) [jolnnVbiciblt3p h oi ) 6otJ'ildjonon6oflo:] 

Both speakers are speaking Standard English (the syntax is 
the same, as is the vocabulary, if one excludes the phonological 
form of the morphemes in that vocabulary), but their accents 
differ: the SSE speaker’s vowel sounds are not always identical 
to those of the RP speaker, and the SSE speaker utters an [ɹ] 
in floor, which the RP speaker does not. 

Now let us imagine that the same SSE speaker wants to 
convey the same proposition, but speaking, this time, in a 
dialect other than Standard English: that of Lowland Scots. 
The result might be: 

(3) [j Alno :kimp h i?dobe: unzondefle :aj] 

(This might be written as You’ll no can put the bairns on the 
floor.) 

In (3), the syntax and vocabulary differ from that of 
Standard English; we may say that these are dialectal 
differences, and distinct in kind from the differences in accent 
which we noted between (1) and (2) above. 

We will return to the matter of accent variation in a later 
chapter; for the moment, let us look at the consonant phoneme 
system shared by most varieties of English, which typically 
looks like this: 

(4) English consonant phonemes 

/p/     as in pie, pit, rip 
/b/     as in buy, bit, rib 
/t/     as in tie, tip, writ 
/d/     as in die, dip, rid 
/k/     as in cool, kit, rick 
/g/     as in ghoul, git, rig 
/tʃ/    as in chew, chit, rich 
/dʒ/    as in Jew, gin, ridge 
/θ/     as in thigh, thin, with 
/ð/     as in then, that, scythe 
/f/     as in fie, fit, riff 
/v/     as in Venn, vat, leave 
/s/     as in sigh, sit, lease 
/z/     as in zoo, zip, please 
/h/     as in high, hip 
/ʃ/     as in shy, ship, leash, mesher 
/ʒ/     as in measure 
/w/     as in wet, win 
/l/     as in lie, lip, real 
/ɹ/     as in rye, rip 
/j/     as in year 
/m/     as in my, meat, rim 
/n/     as in nigh, neat, sin 
/ŋ/     as in sing, ring 

The evidence comes partly in the form of the sorts of 
minimal pair cited in (4), such as measure/mesher 
([mɛʒə]/[mɛʃə]) and sigh/shy ([saɪ]/[ʃaɪ]), but we have by no 
means presented all of the evidence here. Let us look briefly at 
some of the evidence for the three nasal stop phonemes 
postulated here: 

(5) Evidence for English nasal stop phonemes 

(a) [mi:t]    meat        (b) [ni:t]    neat 
(c) [məʊl]    mole        (d) [nəʊl]    knoll 
(e) [sɪn]     sin         (f) [sɪŋ]     sing 
(g) [dɪm]     dim         (h) [dɪn]     din 
(i) [wɪn]     win         (j) [wɪŋ]     wing 
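
One way of checking data such as (5) for contrast is to search it mechanically for minimal pairs: pairs of words whose transcriptions are of the same length and differ in exactly one position. The Python sketch below does this for the ten words in (5). The name `LEXICON`, the function `minimal_pairs` and the decision to treat a long vowel or diphthong as a single segment are assumptions made for the illustration; aspiration and other allophonic detail are omitted.

```python
from itertools import combinations

# The words in (5), as lists of segments (long vowels and diphthongs count
# as one segment each).
LEXICON = {
    "meat": ["m", "i:", "t"], "neat": ["n", "i:", "t"],
    "mole": ["m", "əʊ", "l"], "knoll": ["n", "əʊ", "l"],
    "sin": ["s", "ɪ", "n"], "sing": ["s", "ɪ", "ŋ"],
    "dim": ["d", "ɪ", "m"], "din": ["d", "ɪ", "n"],
    "win": ["w", "ɪ", "n"], "wing": ["w", "ɪ", "ŋ"],
}

def minimal_pairs(lexicon, first, second):
    """Return the word pairs that differ only in `first` versus `second`."""
    pairs = []
    for (word_a, segs_a), (word_b, segs_b) in combinations(lexicon.items(), 2):
        if len(segs_a) != len(segs_b):
            continue
        differences = [(a, b) for a, b in zip(segs_a, segs_b) if a != b]
        # Exactly one differing position, involving the two sounds in question.
        if len(differences) == 1 and set(differences[0]) == {first, second}:
            pairs.append((word_a, word_b))
    return pairs
```

On these data, `minimal_pairs(LEXICON, "m", "n")` finds meat/neat, mole/knoll and dim/din, and `minimal_pairs(LEXICON, "n", "ŋ")` finds sin/sing and win/wing. No pair in (5) itself contrasts [m] with [ŋ]; a further word, such as whim, has to be added before such a pair is found.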

It is clear that [m] and [n] are in parallel distribution: each 
may occur word-initially or word-finally. The distinction is also 
contrastive: it forms the basis for minimal pairs such as 
meat/neat. It is also clear that [n] and [ŋ] are in parallel 
distribution: while [ŋ] does not appear in word-initial position, 
both may occur in word-final position. The distinction is also 
contrastive, as is shown by the existence of minimal pairs such 
as win/wing. The distinction between [m] and [ŋ] is contrastive 
too, as pairs such as whim/wing show. We therefore have clear 
evidence for a three-way phonemic distinction between /m/, /n/ 
and /ŋ/. We will consider this analysis in more detail below. 
The main point to be made at the moment is that we postulate 
the existence of phonemes on the basis of evidence and 
argumentation; if phonemes are mental categories, they cannot 
be directly observed. 

6.2 The Phonological Form of Morphemes 

We have said that, in knowing a language, a speaker possesses 
largely unconscious linguistic knowledge, which subsumes 
semantic, syntactic and phonological knowledge. And we have 
said that the phonological units or categories we have called 
phonemes are part of that phonological knowledge. As we 
progress in this book, we will investigate the question of what 
other sorts of phonological knowledge speakers possess, 
besides phonemes alone. Let us begin this investigation by 
considering the internal structure of words. You will agree that 
the English word cats may be broken down into two 
component parts. Let us call those component parts 
morphemes. Then we may say that this word consists of a 
root morpheme and a plural morpheme (which, in this case, is 
a suffix). Let us say that words of this sort are 
morphologically complex: they consist of more than one 
morpheme. Let us say that a morpheme takes the form of a 
triple: a syntax, a semantics and a phonology. Take the 
morpheme cat: it has a syntax (it is a noun), a semantics (it 
means ‘cat’) and a phonology, which takes the form /kæt/; we 
will refer to this as the phonological form of the morpheme. 
The phonological form of a morpheme may, clearly, consist of 
more than one phoneme. Just as phonemes are mental objects, 
so the phonological form of this morpheme is a mental object: 
/kæt/ is a mental representation in the mind of a speaker, 
whereas the sequence [kʰæt] is a phonetic sequence. 

Let us now consider the adjectives impossible, imbalanced, 
infelicitous, intangible, indirect, insane, incorrect and 
inglorious. All consist of at least a prefix morpheme and a root 
morpheme (some of these words have a prefix, a root and a 
suffix). Many speakers have the following pronunciations of 
these words: 

(6) 

(a) [ɪmpʰɒsɪbl]       impossible 
(b) [ɪmbælənst]       imbalanced 
(c) [ɪɱfəlɪsɪtəs]     infelicitous 
(d) [ɪntʰændʒɪbl]     intangible 
(e) [ɪndɪɹɛkt]        indirect 
(f) [ɪnseɪn]          insane 
(g) [ɪŋkəɹɛkt]        incorrect 
(h) [ɪŋglɔ:ɹɪəs]      inglorious 

It is part of the native speaker’s unconscious linguistic 
knowledge of English that these words all have the same 
prefix. That prefix is one of the morphemes of English, and, 
like all morphemes in the language, has a syntax (it is a prefix), 
a semantics (it has a specific meaning) and a phonology. But 
what is the phonological form of that morpheme? We know 
from the data that the prefix may be realized as [ɪm], [ɪɱ], [ɪn] 
or [ɪŋ]. It is clear, then, that the first phoneme in the prefix is 
/ɪ/ and the second phoneme is a nasal, but which nasal 
phoneme? We claimed above that English has three nasal 
phonemes: /m/, /n/ and /ŋ/. So the phonological form of this 
prefix might be /ɪm/, /ɪn/ or /ɪŋ/. Let us consider /ɪŋ/. We 
could say that the /ŋ/ phoneme is realized as [n] before /t/, /d/ 
and /s/, and as [m] before [p] and [b]. This seems to make 
sense: we can say that, when the prefix is added to a root, the 
place of articulation of the nasal becomes identical to that of 
the first consonant in the root. Thus, it is alveolar when 
followed by an alveolar consonant (such as /t/, /d/ and /s/), 
labio-dental when followed by a labio-dental consonant (such 
as /f/), and bilabial when followed by a bilabial consonant 
(such as /p/ or /b/). This is the process of assimilation we 
referred to in chapter 2, in which one segment becomes similar, 
in some respect, to another when the two are adjacent. Here, 
the assimilation is in place of articulation. Further evidence that 
nasals in English undergo place of articulation assimilation is 
not hard to come by. Consider the following data, which are 
representative of the speech of many speakers of English: 

(7) Nasal assimilation in English 

(a) [ʌŋkʰlɪə]       unclear 
(b) [ʌŋgɒdli]       ungodly 
(c) [ʌɱfɛə]         unfair 
(d) [ʌɱvælju:d]     unvalued 
(e) [ʌntʰɹu:]       untrue 
(f) [ʌndʌn]         undone 
(g) [ʌmbɛəɹəbəl]    unbearable 
(h) [ʌmbaɪəst]      unbiased 

While the /ɪŋ/ solution is plausible, it faces a difficulty: we 
might equally say that the phonological form of the morpheme 
is /ɪn/, or /ɪm/, and that, in either case, the nasal assimilates to 
a following consonant. On the evidence presented thus far, 
there is no non-arbitrary way of choosing between the three 
alternatives: each is as plausible as the others. The following 
data, however, allow us to make a non-arbitrary choice: 

(8) 

(a) [ɪnæktɪv]        inactive 
(b) [ɪnɒpɹətɪv]      inoperative 
(c) [ɪnɛfəbl]        ineffable 
(d) [ɪnədvaɪzəbl]    inadvisable 
(e) [ɪnɔ:dɪbl]       inaudible 
(f) [ɪneɪlɪənəbl]    inalienable 

In each case, there is no consonant at the beginning of the 
root to which the nasal could assimilate: each root begins with 
a vowel. From the fact that vowel-initial roots are realized with 
the [ɪn] form, we can therefore conclude that the phonology of 
the prefix takes the form /ɪn/, and that the nasal does not 
change its place of articulation if the root-initial segment is a 
vowel or an alveolar consonant. Note that this is generally true 
of alveolar nasals in English, as the following data, involving 
the prefix seen in (7), suggest: 

(9) 

(a) [ʌneɪdəd]        unaided 
(b) [ʌnətɹæktɪv]     unattractive 
(c) [ʌnɪvɛntfəl]     uneventful 
(d) [ʌnɔ:θədɒks]     unorthodox 

We might, of course, have said that the morpheme in 
question has four different phonological forms: /ɪm/, /ɪɱ/, /ɪn/ 
and /ɪŋ/, and that words such as impossible, infelicitous, 
indirect and incorrect are each stored mentally with the 
appropriate prefix. There are two problems with this approach. 
Firstly, there is no independent evidence that there is an /ɱ/ 
phoneme in English ([ɱ] never functions contrastively with 
any other nasal). Secondly, even if there were no [ɪɱ] forms, 
we would be committed to claiming, under the ‘several 
phonological forms’ approach, that it is mere coincidence that 
the /ɪm/ form is attached only to roots beginning with a bilabial 
consonant, the /ɪŋ/ form only to roots beginning with a velar 
consonant, and the /ɪn/ form only to roots beginning with 
alveolar consonants and vowels. But that is surely an 
implausible claim. So, for this sort of case at least, the idea that 
we should postulate more than one phonological form for a 
morpheme is deeply unattractive and implausible. Given the 
data we have seen thus far, it appears much more plausible to 
say that any given morpheme has a single phonological form. 
And if that is the case, then it is the phonologist’s task to 
hypothesize as to what that form might be. In doing so, she or 
he will be guided by evidence and argumentation: the facts of 
the matter, since they are mental in nature, and thus not 
directly observable, will not be available for immediate 
inspection via the five senses. 

In adopting this ‘one phonological form per morpheme’ 
approach, we are allowing that, while any given morpheme has 
only one phonological form, that phonological form may 
‘correspond’ in some sense to a variety of different phonetic 
forms. In the case we have just looked at, the prefix has the 
phonological form /ɪn/, but that in turn corresponds to four 
different phonetic forms: [ɪm], [ɪɱ], [ɪn] and [ɪŋ]. Phonologists 
refer to such phonetic forms as alternants: we say that there is 
an alternation between the four forms. Which alternant of a 
given morpheme occurs in a given word is entirely predictable; 
there is a generalization which captures that predictability, and 
we are able to express it in the form of a phonological rule, just 
as we did with [r] and [l] in Korean. In our English case, the 
rule in question concerns nasals in general; it might be put 
informally as: 

(10) The rule of nasal assimilation in English 

If the phonological form of a prefix ends in a nasal 
then that nasal will assimilate to the place of 
articulation of a following consonant. 
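
The informal statement in (10) can be made concrete as follows. This Python sketch is offered as an illustration under stated assumptions rather than as the formalized rule: prefixes and roots are lists of segment symbols, the place-of-articulation table `PLACE_OF_ARTICULATION` covers only the consonants that appear in the data above, and the function name `attach_prefix` is ours. A root-initial vowel, or a consonant missing from the table, leaves the nasal unchanged.

```python
# Place of articulation for the root-initial consonants seen in the data.
PLACE_OF_ARTICULATION = {
    "p": "bilabial", "b": "bilabial", "m": "bilabial",
    "f": "labio-dental", "v": "labio-dental",
    "t": "alveolar", "d": "alveolar", "s": "alveolar", "n": "alveolar",
    "k": "velar", "g": "velar",
}

# The nasal stop produced at each place of articulation.
NASAL_FOR_PLACE = {
    "bilabial": "m",
    "labio-dental": "ɱ",
    "alveolar": "n",
    "velar": "ŋ",
}
NASALS = set(NASAL_FOR_PLACE.values())

def attach_prefix(prefix, root):
    """Join prefix and root, assimilating a prefix-final nasal as in (10)."""
    segments = list(prefix)
    if segments and root and segments[-1] in NASALS:
        place = PLACE_OF_ARTICULATION.get(root[0])
        if place is not None:
            segments[-1] = NASAL_FOR_PLACE[place]
    return segments + list(root)
```

With the prefix given as `["ɪ", "n"]`, the sketch yields a bilabial nasal before the root of impossible, a velar nasal before that of incorrect, and an unchanged alveolar nasal before the vowel-initial root of inactive.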

We could have formulated this generalization as a formalized 
rule, or as some kind of constraint on the phonological form of 
morphemes in English. We will not go into the types of 
formalism required to express such generalizations, or inquire 
whether they are best expressed as rules or as constraints. The 
most important point is that native speakers appear to be in 
possession of generalizations of this sort, and that these appear 
to constitute a part of their largely unconscious phonological 
knowledge. 

The data we have just considered also exemplify an 
important phenomenon: that of phonemic overlapping. On 
the basis of the data in (5), we postulated the following nasal 
stop phonemes, with the realizations shown: 

(11) 

    /m/     /n/     /ŋ/ 
     |       |       | 
    [m]     [n]     [ŋ] 


However, we have allowed that the /n/ phoneme may also be 
realized as [m] if it precedes a bilabial consonant, or [ŋ] if it 
precedes a velar consonant. This means that a given 
occurrence of [m], for instance, may be either a realization of 
/m/, as in map, or a realization of /n/, as in improbable. That 
is, the /m/ and /n/ phonemes overlap in their realizations. We 
may depict this as follows: 

(12) 

    /m/      /n/      /ŋ/ 
     |     /  |  \     | 
     |   /    |    \   | 
    [m]      [n]      [ŋ] 


The question arises: how can the speaker of English tell 
whether a given [m] is a realization of /m/ or of /n/? The 
answer is that the phonological context allows the speaker to 
tell: an [m] which does not precede a bilabial consonant will be 
a realization of the /m/ phoneme. The phonemic contrast 
between /m/ and /n/ is said to be neutralized before a bilabial 
consonant. Neutralization is the suspension of phonemic 
contrasts in one or more specifiable contexts. 
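
The reasoning from context can be sketched in Python as a mapping from a phonetic nasal and the segment that follows it to the set of phonemes it could realize. The function name `candidate_phonemes` and the small consonant sets are assumptions made for this illustration, which covers only the three nasals and the two neutralizing contexts discussed above.

```python
# Assumed sets of following consonants that trigger assimilation of /n/.
BILABIAL_CONSONANTS = {"p", "b", "m"}
VELAR_CONSONANTS = {"k", "g"}

def candidate_phonemes(nasal, following=None):
    """Return the nasal phonemes that a phonetic nasal could realize."""
    if nasal == "n":
        return {"n"}
    if nasal == "m":
        # Before a bilabial the /m/ vs /n/ contrast is neutralized.
        return {"m", "n"} if following in BILABIAL_CONSONANTS else {"m"}
    if nasal == "ŋ":
        # Before a velar the /n/ vs /ŋ/ contrast is neutralized.
        return {"n", "ŋ"} if following in VELAR_CONSONANTS else {"ŋ"}
    raise ValueError(f"not a nasal covered by this sketch: {nasal!r}")
```

The [m] of map, which precedes a vowel, gets the single candidate /m/ from `candidate_phonemes("m", "æ")`. The [m] of improbable, which precedes [p], gets both /m/ and /n/ from `candidate_phonemes("m", "p")`: the phonetic signal alone does not decide between them, which is what is meant by saying that the contrast is suspended in that context.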

6.3 English Vowel Phonemes 


Accents of English vary considerably in their vowel phoneme 
systems and in the range of allophones that those phonemes 
have. We begin by depicting a set of postulated vowel 
phonemes for RP and GA. The RP and GA phonetic vowel 
qualities we presented and discussed in chapters 3 and 4 are 
typically contrastive for most speakers of those accents, and 
we may therefore postulate the following (stressed) vowel 
phonemes for RP and GA: 

(13a) RP vowel phonemes 

Phoneme   Examples               Wells (1982; see Suggested Further Reading) lexical sets 

/ʌ/       as in putt             STRUT 
/ʊ/       as in put              FOOT 
/u:/      as in pool, shoe       GOOSE 
/ɪ/       as in pit              KIT 
/i:/      as in peat, lea        FLEECE 
/ɛ/       as in pet              DRESS 
/eɪ/      as in pate, lay        FACE 
/ɒ/       as in pot              LOT 
/əʊ/      as in pole, low        GOAT 
/ɔ:/      as in port, law        NORTH, FORCE, THOUGHT 
/æ/       as in pat              TRAP 
/ɑ:/      as in part, Shah       START, BATH, PALM 
/ɜ:/      as in pert, furry      NURSE 
/ɔɪ/      as in coin, boy        CHOICE 
/aɪ/      as in pile, buy        PRICE 
/aʊ/      as in pout, cow        MOUTH 
/ɪə/      as in fierce, leer     NEAR 
/ɛə/      as in scarce, lair     SQUARE 
/ʊə/      as in gourd, lure      CURE 

(13b) GA vowel phonemes 

Phoneme   Examples                          Wells (1982) lexical sets 

/ʌ/       as in putt                        STRUT 
/ʊ/       as in put                         FOOT, CURE 
/u:/      as in pool, shoe                  GOOSE 
/ɪ/       as in pit                         KIT 
/i:/      as in peat, lea                   FLEECE, NEAR 
/ɛ/       as in pet                         DRESS, SQUARE 
/eɪ/      as in pate, lay                   FACE 
/oʊ/      as in pole, low                   GOAT 
/ɔ/       as in law, short                  NORTH, FORCE, THOUGHT 
/æ/       as in pat                         TRAP, BATH 
/ɑ/       as in part, Shah, pot, caught     START, PALM, LOT 
/ɜ/       as in pert, furry                 NURSE 
/ɔɪ/      as in coin, boy                   CHOICE 
/aɪ/      as in pile, buy                   PRICE 
/aʊ/      as in pout, cow                   MOUTH 

Again, what the set of RP or GA vowel phonemes might be 
is a matter for argumentation based on evidence and general 
theoretical considerations. For instance, we might have 
suggested that the second vowel in words like pew is a 
diphthong ([iu:]) and that, since pew forms a minimal pair with 
pie, pea, etc., then /iu:/ is an RP vowel phoneme. We will 
return, in due course, to this kind of question. We should also 
note that there is a further vowel phoneme which is not listed 
here: /ə/ (schwa), which differs from all the other phonemes 
listed above, since it does not occur in stressed position (as we 
noted in chapter 3). We will also return, in due course, to /ə/ 
and its relation to the phonetic segment [ə]. 

Like consonant phonemes, vowel phonemes may have 
allophones. For instance, speakers of many accents of English 
have two realizations of the vowel phoneme /i:/: [i:] and [i:ə]. 
The latter typically occurs before a velarized lateral (‘dark l’), 
as the following data show: 

(14) Allophones of /i:/ 

(a) [fi:t]     feet      (b) [fi:əɫ]     feel 
(c) [di:p]     deep      (d) [di:əɫ]     deal 
(e) [pʰi:k]    peak      (f) [pʰi:əɫ]    peel 
(g) [si:m]     seem      (h) [si:əɫ]     seal 

We postulate /i:/ rather than /i:ə/ as the form of the phoneme, 
since we assume that the realization of the phoneme when it 
precedes a dark l is influenced by the dark l. The schwa 
articulation, which is retracted from the high front [i:] position, 
is a matter of the vowel articulation assimilating to the tongue 
body retraction in the dark l. In doing so, we appeal to the idea 
of phonetic motivation: our analysis is phonetically motivated 
in the sense that we can provide an articulatory reason for the 
/i:/ → [i:ə] process, whereas we would be unable to provide 
any such motivation for a process in which /i:ə/ → [i:] 
word-finally and before any consonant other than [ɫ]. 

We are also assuming, as we did in chapter 5, that the 
phoneme /l/ has two allophones, [l] and [ɫ]. We said there that 
there is an /l/ realization rule: /l/ is realized as [l] immediately 
before vowels, and as [ɫ] immediately after vowels. 

These two claims appear to commit us to the idea that the 
rule governing the occurrence of [ɫ] must, in some sense, 
‘precede’ the rule governing the phoneme /i:/, since we are 
claiming that [ɫ] only ever arises as a result of the application of 
the /l/ rule, and that [i:ə] only ever arises when an [ɫ] follows. 
We may depict this claim about the interaction of the two rules 
as follows: 

(15) 

              /fi:l/ 
/l/ rule      fi:ɫ 
/i:/ rule     fi:əɫ 

This kind of depiction is referred to as a derivation: the 
phonetic realization of the phonological form /fi:l/ is derived 
from that phonological form by means of the ordered 
application of rules. This way of looking at the relationship 
between phonological forms and their phonetic realization 
therefore commits us to the idea of rule ordering, and thus to 
a rule-based, derivational view of phonological organization. 
While we will not pursue this conception of phonological 
organization in any depth, it is as well to acknowledge that we 
are implicitly assuming such a conception. 
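
The ordered application of the two rules can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (l_rule, i_rule, derive), the list-of-segments representation and the small vowel set are hypothetical, and an /l/ that stands between two vowels is assumed to stay clear, a case the rule as stated above leaves open.

```python
# A minimal sketch of an ordered-rule derivation for /fi:l/.
# All names here are illustrative assumptions, not part of the analysis itself.

VOWELS = {"i:", "ɪ", "ɛ", "u:", "ə"}  # small illustrative subset


def l_rule(segments):
    """/l/ rule: /l/ is dark [ɫ] after a vowel, unless a vowel follows."""
    out = list(segments)
    for i, seg in enumerate(segments):
        if seg != "l":
            continue
        after_vowel = i > 0 and segments[i - 1] in VOWELS
        before_vowel = i + 1 < len(segments) and segments[i + 1] in VOWELS
        if after_vowel and not before_vowel:
            out[i] = "ɫ"
    return out


def i_rule(segments):
    """/i:/ rule: /i:/ is realized as [i:ə] when a dark [ɫ] follows."""
    out = []
    for i, seg in enumerate(segments):
        if seg == "i:" and i + 1 < len(segments) and segments[i + 1] == "ɫ":
            out.append("i:ə")
        else:
            out.append(seg)
    return out


def derive(segments, rules):
    """Apply the rules in the order given, recording each step."""
    steps = [("underlying", "".join(segments))]
    for rule in rules:
        segments = rule(segments)
        steps.append((rule.__name__, "".join(segments)))
    return steps


for name, form in derive(["f", "i:", "l"], [l_rule, i_rule]):
    print(f"{name:12} {form}")
```

With the rules in the order shown, the last step is fi:əɫ. If the order is reversed, the /i:/ rule finds no dark l to respond to and the result is fi:ɫ, which is why the analysis commits us to rule ordering.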

Another example of vowel allophones in English concerns 
vowel length. In many languages, vowel length is phonemic. 
For instance, in Limbu (spoken in Nepal), [sapma] means ‘to 
write’, but [sa:pma] means ‘to flatter’. Similarly, in Malayalam 
(spoken in Southern India), [ciri] means ‘smile’ but [ci:ri] 
means ‘shrieked’. But in other languages, vowel length is 
allophonic. In Scottish Standard English (SSE), for instance, 
some (not all) of the vowel phonemes have long and short 
allophones: 

(16) Long and short vowels in Scottish Standard English 

[hi:v]  heave    [bɹi:ð]  breathe   [bɹi:z]  breeze   [bi:ɹ]  beer   [bi:]   bee 
[bif]   beef     [hiθ]    heath     [flis]   fleece   [diɫ]   deal   [bit]   beat 

[mʉ:v]  move     [smʉ:ð]  smooth    [lʉ:z]   lose     [bʉ:ɹ]  boor   [blʉ:]  blue 
[hʉf]   hoof     [tʰʉθ]   tooth     [lʉs]    loose 


The long allophones of the phonemes /i/, /u/ and /ai/ occur in 
the following contexts: at the end of a word or before one of 
the following voiced consonants: /v/, /ð/, /z/ and /ɹ/. Vowel 
length is therefore allophonic, rather than phonemic, in SSE. 
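
The context-dependence of length in SSE can be expressed as a small function. The sketch below is an assumption-laden illustration: the name realize, the segment lists and the mapping of the phoneme /u/ to the quality [ʉ] are introduced here for convenience, and /ai/ is left out because its long and short realizations are not listed in (16).

```python
# A sketch of the SSE length generalization: a vowel is long word-finally
# or before /v/, /ð/, /z/, /ɹ/, and short elsewhere.
# Names and data structures are illustrative assumptions.

LENGTHENING_CONSONANTS = {"v", "ð", "z", "ɹ"}
QUALITY = {"i": "i", "u": "ʉ"}  # phoneme -> phonetic quality, as in (16)


def realize(phonemes):
    """Return a phonetic string with length marked on /i/ and /u/."""
    out = []
    for i, seg in enumerate(phonemes):
        if seg not in QUALITY:
            out.append(seg)
            continue
        word_final = i == len(phonemes) - 1
        long_context = word_final or phonemes[i + 1] in LENGTHENING_CONSONANTS
        out.append(QUALITY[seg] + (":" if long_context else ""))
    return "".join(out)


print(realize(["h", "i", "v"]))   # hi:v  (heave)
print(realize(["b", "i", "t"]))   # bit   (beat)
print(realize(["b", "l", "u"]))   # blʉ:  (blue)
print(realize(["l", "u", "s"]))   # lʉs   (loose)
```

Because the long and short variants are fully predictable from the following context, no pair of SSE words can differ in vowel length alone, which is what it means for length to be allophonic.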




We considered the case of the neutralization of the contrast 
between the consonant phonemes /m/ and /n/ before bilabial 
consonants. Neutralization of contrast between vowel 
phonemes is also possible. An example comes from the 
contrast between /æ/, /ɛ/ and /eɪ/ in GA: these three phonemes 
are all realized as [ɛ] before the /ɹ/ phoneme. Listen to the 
sentence Marry Merry Mary on Track 6.1 at 
www.wiley.com/go/carrphonetics. The GA speaker utters [ɛ] in 
each of the three words: they are homophones in GA. The 
other speakers are speaking SSE and RP respectively: each has 
distinct realizations for each of the three phonemes: the Marry 
Merry Mary neutralization does not occur in SSE or RP. 


Exercises 

1 Phonemic contrasts in GA, SSE and RP 
The following data sets are from General American (GA), 
Scottish Standard English (SSE) and RP (Received 
Pronunciation). On the basis of presence of minimal pairs 
in one variety vs absence of such pairs in another variety, 
identify any phonemic contrasts which are present in GA, 
but not in SSE, and vice versa. Then identify any 
phonemic contrasts which are present in RP, but not in 
GA, and vice versa. Finally, identify any phonemic 
contrasts which are present in SSE, but not RP, and vice 
versa. Assume that the data given here are fully 
representative of the varieties in question. 
      GA          SSE         RP 

(a)   [weɪɫz]     [ʍeɫz]      [weɪɫz]     whales 
(b)   [weɪɫz]     [weɫz]      [weɪɫz]     Wales 
(c)   [lɑk]       [lɔx]       [lɒk]       loch 
(d)   [lɑk]       [lɔk]       [lɒk]       lock 
(e)   [wɪn]       [ʍɪn]       [wɪn]       whin 
(f)   [wɪn]       [wɪn]       [wɪn]       win 
(g)   [pʰu:ɫ]     [pʰʉɫ]      [pʰu:ɫ]     pool 
(h)   [pʰʊɫ]      [pʰʉɫ]      [pʰʊɫ]      pull 
(i)   [sæm]       [sam]       [sæm]       Sam 
(j)   [sɑm]       [sam]       [sɑ:m]      psalm 
(k)   [hɔɹs]      [hɔɹs]      [hɔ:s]      horse 
(l)   [hɔɹs]      [hoɹs]      [hɔ:s]      hoarse 
(m)   [kʰɔ:t]     [kʰɔt]      [kʰɔ:t]     caught 
(n)   [kʰɑt]      [kʰɔt]      [kʰɒt]      cot 


2 Nasal stops in English and Spanish 


Recall that data such as the following led to our decision 
to postulate three nasal stop phonemes in English (/m/, /n/ 
and /ŋ/): 

English 

(a) [mi:t]    meat 
(b) [ni:t]    neat 
(c) [məʊl]    mole 
(d) [nəʊl]    knoll 
(e) [sɪn]     sin 
(f) [sɪŋ]     sing 
(g) [dɪm]     dim 
(h) [dɪn]     din 
(i) [wɪn]     win 
(j) [wɪŋ]     wing 

Now examine the same three nasal stops ([m], [n], [ŋ]) in 
the following data from Spanish. (We ignore the fact that 
Spanish has a dental, rather than an alveolar, nasal stop. 
These data are representative of Castilian Spanish, the 
prestige accent spoken in Spain: many varieties of Spanish 
lack the /θ/ and /ʎ/ phonemes. This fact does not affect 
the point made in the exercise.) Assume that the data are 
fully representative: 

Spanish 

(a) [mudo]      ‘mute’ 
(b) [nudo]      ‘knot’ 
(c) [meta]      ‘goal’ 
(d) [neta]      ‘pure’ (feminine) 
(e) [ombre]     ‘man’ 
(f) [semblar]   ‘to seem’ 
(g) [andar]     ‘to go’ 
(h) [ante]      ‘in the face of’ 
(i) [aŋgulo]    ‘angle’ 
(j) [leŋgwa]    ‘language’ 

Note. [ŋ] does not appear word-initially, word-finally or 
between vowels. 

(i) 

What evidence is there for postulating distinct /m/ and /n/ 
phonemes in Spanish? 

(ii) 
Are there any grounds for postulating an /ŋ/ phoneme in 
Spanish? What evidence is there for your answer? 

Now look at the following Spanish data: 

(k) [feliθ]              ‘happy’ 
(l) [iɱfeliθ]            ‘unhappy’ 
(m) [posible]            ‘possible’ 
(n) [imposible]          ‘impossible’ 
(o) [dispensable]        ‘dispensable’ 
(p) [indispensable]      ‘indispensable’ 
(q) [konθebible]         ‘conceivable’ 
(r) [iŋkonθebible]       ‘inconceivable’ 
(s) [akostumbrado]       ‘accustomed’ 
(t) [inakostumbrado]     ‘unaccustomed’ 

(iii) 

Is there any phonemic overlapping of the Spanish nasal 
stop phonemes? Explain. 


3 Further phonetic transcription practice 
Listen to Track 6.2 . Transcribe, with as much phonetic 
detail as possible, the words you hear on the recording. 
The words are: 

unkind unbroken unaided reed reel 
7 


English Syllable Structure 


7.1 Introduction 

We have said morphemes are a kind of mental representation 
which have three properties: a syntactic category, a meaning 
and a phonological form. We have allowed, thus far, that the 
phonological form of a morpheme is present in the speaker’s 
mentally constituted grammar, and that this phonological form 
consists in either a single phonological segment or a sequence 
of such segments. But this is only part of the story: there is 
more to the phonological form of a morpheme than that. There 
is evidence that those segments are organized into 
phonological constituents, rather in the way that words are 
organized into syntactic constituents (such as phrases and 
sentences). One of those constituents is the syllable. The 
evidence for the existence of the syllable comes largely in the 
form of phonological generalizations which cannot be 
adequately expressed without reference to the notion ‘syllable’. 
The aim of this chapter is to examine the structure of the 
syllable in English, and exemplify some of the sorts of 
phonological generalization which are best expressed in terms 
of that structure. 
7.2 Constituency in Syllable 
Structure 

The two main constituents within a syllable are the onset and 
the rhyme. In the word bile, for instance, the first segment, 
/b/, constitutes the onset of the syllable and the last two 
segments, /aɪ/ and /l/, taken together, constitute the rhyme. The 
onset is defined as any and all consonants occurring before the 
vowel. What evidence is there for this division between onset 
and rhyme? The device of alliteration depends on identity of 
onsets, independently of the content of the rhyme, as in little 
and light, poor and packed, and so on. This constitutes 
evidence for the onset/rhyme division, and thus evidence that 
the rhyme is a well-founded syllabic constituent. Since that is 
so, then the onset as a constituent is equally well-founded 
(since the two are defined in contradistinction to each other). 
Slips of the tongue also show that the onset is a real unit in 
speech production. One type of slip of the tongue is the 
spoonerism, named after an academic called Spooner, who is 
said to have uttered sentences such as ‘You have missed my 
history lecture’ as ‘You have hissed my mystery lecture’, with 
an inversion of the onsets of ‘missed’ and ‘history’. 

The rhyme may be further subdivided into the constituents 
nucleus and coda. Thus, in the word bile, the diphthong /aɪ/ 
constitutes the nucleus, and the consonant /l/ constitutes the 
coda. We may represent the constituency of the single-syllable 
morpheme bile as follows, where Greek ‘σ’ (sigma) stands for 
‘syllable’, ‘O’ stands for ‘onset’, ‘R’ stands for ‘rhyme’, ‘N’ 
stands for ‘nucleus’, and ‘C’ stands for ‘coda’: 
(1) bile: 

        σ
      /   \
     O     R
     |    / \
     |   N   C
     |   |   |
     b   aɪ  l



A syllable such as this, which contains one or more 
consonants in coda position, is called a closed syllable, 
whereas a syllable which does not contain any consonants in 
coda position is referred to as an open syllable; as in the word 
buy. 


(2) buy: 

        σ
      /   \
     O     R
     |     |
     |     N
     |     |
     b     aɪ



While a syllable must have a nucleus, it is possible to have a 
well-formed syllable which does not contain any element other 
than a nucleus. The segment occupying the nucleus of the 
syllable is normally a vowel. An example of a word in English 
consisting of only one syllable, which in turn contains only a 
nucleus, is eye: /aɪ/. But the nucleus of a syllable in English 
may be preceded or followed by other segments, as we have 
seen, and those segments are typically consonants. In the word 
aisle, for instance, the nucleus is followed by a consonant in 
coda position: /aɪl/. In the word buy, the nucleus is preceded by 
a consonant in onset position: /baɪ/, and in the word bile, the 
nucleus is both preceded and followed by consonants: /baɪl/. 

Morphemes like bile, which contain only one syllable, are 
said to be monosyllabic. In some languages, all morphemes 
are monosyllabic. But in English, morphemes may contain 
more than one syllable: they may be polysyllabic. Examples 
are rider, beetle, amount, desire (which are bisyllabic), 
elephant, veranda, kangaroo (which are trisyllabic), 
independent, America (which have four syllables) and so on. 

In some languages, all syllables must contain an onset 
consonant but, as we have seen, in English (and this is true of 
many other languages), this is not the case. For reasons to be 
explained later (connected with the notion of 
‘syllabification’), we will represent such syllables with an 
empty onset, as follows: 

(3) it: 

        σ
      /   \
     O     R
          / \
         N   C
         |   |
         ɪ   t


In many languages, such as Hawaiian, onsets may contain a 
single consonant only, but in many others, English included, 
onsets may contain two segments (as in bring, trap, clip, etc.); 
we will refer to these as branching onsets, and represent them 
as follows: 

(4) clip: 

          σ
        /   \
       O     R
      / \   / \
     /   \ N   C
     |   | |   |
     k   l ɪ   p


Just as onsets may be branching, so codas may branch, as in 
the word hunt: 

(5) hunt: 

        σ
      /   \
     O     R
     |    / \
     |   N   C
     |   |  / \
     h   ʌ n   t


The distinction between, on the one hand, short vowels and, 
on the other, long vowels and diphthongs can be represented 
by taking the latter class of vowels to occupy a branching 
nucleus, with the former class occupying a non-branching 
nucleus. To represent the fact that long vowels and diphthongs 
are longer than short vowels, we say that segments are 
attached to a series of timing slots, referred to as the skeletal 
tier. The idea is that one can represent the difference between 
short vowels on the one hand, and long vowels (including 
English diphthongs) on the other, by taking the former to be 
connected to a single skeletal slot and the latter to be connected 
to two skeletal slots. Thus, bit has a short vowel as its nucleus 
and is therefore represented with a non-branching nucleus, 
whereas bee and buy have branching nuclei: 

(6) bit: 

        σ
      /   \
     O     R
     |    / \
     |   N   C
     |   |   |
     X   X   X
     |   |   |
     b   ɪ   t


(7) bee: 

        σ
      /   \
     O     R
     |     |
     |     N
     |    / \
     X   X   X
     |    \ /
     b     i


(8) buy: 

        σ
      /   \
     O     R
     |     |
     |     N
     |    / \
     X   X   X
     |   |   |
     b   a   ɪ


What is intended by the representations in (7) and (8) is that 
long vowels are constituted as a single vowel quality which is 
attached to two skeletal slots, whereas long diphthongs, as in 
buy, have two different vowel qualities. The point is that nuclei 
with long vowels and with diphthongs are parallel with respect 
to the number of timing slots within the nucleus. We will 
henceforth adopt the skeletal tier in our representations of 
syllabic structure. 

The skeletal tier enables us to say that affricates, which, as 
we have seen, have a closure element and a fricative release 
element, as in [tʃ] and [dʒ], are complex segments, since they 
behave like single segments (they occupy a single unit of 
timing) while having an internal structure which resembles two 
segments: 

(9) chip: 

        σ
      /   \
     O     R
     |    / \
     |   N   C
     |   |   |
     X   X   X
    / \  |   |
   t   ʃ ɪ   p


7.3 The Sonority Hierarchy, 
Maximal Onset and Syllable 
Weight 

You will agree, if you are a native speaker of English, that 
/blɪŋk/ is a well-formed syllable (and happens to constitute the 
phonological form of an English word), as are the following: 
/blʊŋk/, /blɛŋk/, /blæŋk/ and /blʌŋk/ (most of which happen 
not to constitute the phonological form of English words). That 
is, your native speaker knowledge of English allows you to 
judge that these are syllabically well-formed, even though there 
are no words in English which have those phonological forms. 
That unconscious knowledge also allows you to judge that the 
following are ill-formed: /lbɪŋk/, /lbɪkŋ/, /tlɪŋk/ and /blaɪmp/. 
The question is: what form does this unconscious knowledge 
take? What is it that we know, unconsciously, which allows us 
to make these judgements? Let us now seek to answer that 
question. 
It is widely believed that there are both universal and 
language-specific constraints on the form that syllables may 
take, that is, constraints on the syllabification of sequences of 
segments. Among the universal constraints, we may mention 
two. Firstly, it is claimed that sequences of segments are 
syllabified in accordance with a sonority scale, which takes 
the following form: 

(10) Sonority scale 
Low vowels 
High vowels 
Approximants 
Nasals 
Voiced fricatives 
Voiceless fricatives 
Voiced stops 
Voiceless stops 

The idea is that, as one proceeds from the bottom to the top 
of the scale, the class of segments becomes more sonorous, or 
more vowel-like. Sonority is an acoustic effect: the more 
sonorous a sound, the more it resonates. Vowels have greater 
resonance than consonants, and voiced consonants have 
greater resonance than voiceless ones. If you listen to a singer 
holding a note for any length of time, the sound in question will 
most probably be a vowel. There are two articulatory reasons 
why it is easier to hold a vowel sound for longer than a 
consonant sound, and both are relevant to the production of 
sonority. The first is degree of constriction, as discussed in 
chapter 1, whereby stops are said to involve a greater degree of 
stricture than fricatives, which in turn involve greater 
constriction than approximants and vowels. Similarly, the more 
open a vowel articulation, the less stricture there is in the oral 
cavity. The acoustic effect of these sorts of articulation is that 
the lesser the degree of constriction, the greater the degree of 
sonority. The second articulatory factor is voicing : voiceless 
segments are less vowel-like, less sonorant, than voiced 
segments: vowels are typically voiced, and voicing creates 
greater sonority. 

Applied to syllable structure, the idea is that the most 
sonorous element in a syllable will be located within the 
nucleus, and that the further one gets from the nucleus, the less 
sonorous are the segments. Thus, in blink, the /b/ is less 
sonorous than the /l/, which is, in turn, less sonorous than the 
vowel: as one approaches the nucleus, so sonority increases. 
As one leaves the nucleus, we may note that the /ŋ/ is less 
sonorous than the vowel, and the /k/ less sonorous in turn than 
the preceding /ŋ/. 
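
The rise-and-fall pattern just described can be checked mechanically. The following is a minimal sketch, assuming a hypothetical table SEGMENT_CLASS that assigns a handful of segments to the classes of the scale in (10); the function names are likewise illustrative, and the nucleus is simply taken to be the most sonorous segment.

```python
# A sketch of a sonority check for a single syllable.
# SCALE follows (10), listed here from least to most sonorous.
# SEGMENT_CLASS is a small illustrative table, not a full inventory.

SCALE = [
    "voiceless stop", "voiced stop",
    "voiceless fricative", "voiced fricative",
    "nasal", "approximant", "high vowel", "low vowel",
]

SEGMENT_CLASS = {
    "p": "voiceless stop", "t": "voiceless stop", "k": "voiceless stop",
    "b": "voiced stop", "s": "voiceless fricative",
    "n": "nasal", "ŋ": "nasal",
    "l": "approximant", "ɹ": "approximant",
    "ɪ": "high vowel",
}


def sonority(segment):
    """Rank of a segment on the scale (higher means more sonorous)."""
    return SCALE.index(SEGMENT_CLASS[segment])


def obeys_sonority(segments):
    """True if sonority rises strictly to one peak and then falls strictly."""
    ranks = [sonority(s) for s in segments]
    peak = ranks.index(max(ranks))
    rising = all(a < b for a, b in zip(ranks[:peak], ranks[1:peak + 1]))
    falling = all(a > b for a, b in zip(ranks[peak:], ranks[peak + 1:]))
    return rising and falling


print(obeys_sonority(["b", "l", "ɪ", "ŋ", "k"]))  # True
print(obeys_sonority(["l", "b", "ɪ", "ŋ", "k"]))  # False
print(obeys_sonority(["t", "l", "ɪ", "ŋ", "k"]))  # True
```

Note that the last sequence passes the check even though it is not a possible English syllable. The sonority scale alone does not exclude a /tl/ onset; something more specific to English must be responsible for its ill-formedness.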

The ‘degree of sonority’ idea is very convincing, even if it 
runs into some difficulties. For instance, [s] + consonant onset 
clusters in English undermine the predictions made by the 
sonority hierarchy, since, in cases such as sprint, the sonority 
scale principle makes the right predictions except with respect 
to the initial [s]. However, this merely serves further to 
underline the peculiarity of English sC (s + consonant) onset 
clusters: only [s]-initial onsets violate the sonority hierarchy, 
and the only three-way branching onsets in English are those 
which begin with an [s]. 

Another universal principle of syllabification concerns the 
syllabification of polysyllabic words, and is referred to as the 
principle of Maximal Onset. We have considered only 
monosyllabic words thus far; let us therefore consider the 
syllabification of the English word appraise, whose segmental 
form is, let us say, /əpɹeɪz/. It is clear that the word is 
bisyllabic; the question is where the boundary between the 
syllables lies. We know that /p/ may occur in coda position in 
English, as in cap, cup, etc. We also know that /pɹ/ is a 
well-formed onset, as in prize, preen, etc., and we know that /ɹ/ 
may occur alone in onset position, as in rice, raze, etc. 
Furthermore, we know that /pɹ/ is not a well-formed coda 
cluster: it violates the predictions of the sonority hierarchy. 
Thus, /u:pɹ/, /eɪpɹ/ etc. are ill-formed. We must therefore 
decide whether the syllabification of appraise is /ə.pɹeɪz/ or 
/əp.ɹeɪz/ (where the full stop indicates the syllable boundary). 
The principle of Maximal Onset says that, in cases like this, 
where the language-specific phonotactics will allow for two or 
more syllabifications across a syllable boundary, it is the 
syllabification which maximizes the material in the following 
onset which is preferred. In this case, that is the former 
syllabification. 
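
The principle can be sketched procedurally: between two nuclei, assign to the following onset the longest stretch of consonants that forms a permissible onset. The code below is an illustration only. The names syllabify, VOWELS and LEGAL_ONSETS are assumptions made for the sketch, the onset list is a tiny sample rather than the full English set, and coda well-formedness is not checked.

```python
# A sketch of Maximal Onset syllabification over a list of segments.
# VOWELS and LEGAL_ONSETS are small illustrative samples.

VOWELS = {"ə", "eɪ", "æ", "ɪ"}
LEGAL_ONSETS = {
    (),                       # the empty onset is always available
    ("p",), ("t",), ("ɹ",), ("l",),
    ("p", "ɹ"),
}


def syllabify(segments):
    """Insert '.' at syllable boundaries, maximizing each onset."""
    nuclei = [i for i, seg in enumerate(segments) if seg in VOWELS]
    boundaries = set()
    for left, right in zip(nuclei, nuclei[1:]):
        cluster = segments[left + 1:right]
        # Try the longest candidate onset first, then shorter ones.
        for split in range(len(cluster) + 1):
            if tuple(cluster[split:]) in LEGAL_ONSETS:
                boundaries.add(left + 1 + split)
                break
    out = []
    for i, seg in enumerate(segments):
        if i in boundaries:
            out.append(".")
        out.append(seg)
    return "".join(out)


print(syllabify(["ə", "p", "ɹ", "eɪ", "z"]))  # ə.pɹeɪz
print(syllabify(["æ", "t", "l", "ə", "s"]))   # æt.ləs
```

In the second example the whole cluster cannot go into the onset, since /tl/ is not in the sample onset list, so only /l/ is assigned to the second syllable.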

The principle of Maximal Onset is intimately connected with 
a universal fact about syllable structure: that syllables with an 
onset consonant are in some sense more basic than those 
without, and that presence of onset consonants is in some 
sense more basic than presence of coda consonants. It appears 
that the most ‘basic’ syllable structure in human languages is 
CV syllable structure, with a single onset consonant followed 
by a vowel. There are several types of evidence for this claim. 

Firstly, CV-type syllables appear to be the syllable types that 
human children first utter when they begin to speak (e.g. [ba], 
[ma]) regardless of what language their parents speak. At that 
stage in the development of the child’s syllable structure, 
syllables in the adult language with branching onsets will be 
uttered as CV structures. So too will syllables with coda 
consonants: the coda consonants will simply be absent at that 
stage. This strongly suggests that onset consonants are in some 
way more basic, in articulatory, and perhaps perceptual, terms 
than coda consonants. 

Secondly, in many cases of aphasia, where post-stroke 
patients have suffered damage to their speech, CV syllable 
structures also appear to be the sort that first begins to appear 
as the patient recovers his or her speech, even if his or her 
native language has branching onsets and coda consonants. 

Thirdly, languages which have both onset and coda 
consonants typically allow for a wider range of consonants to 
occur in onset position than in coda position. 

Fourthly, coda consonants are much more likely to undergo 
loss of articulation in the course of the historical development 
of languages than onset consonants. This is what has happened 
with /l/ in coda position in some varieties of English, where the 
realization of /l/ has become vocalized ([w], which is 
vowel-like, rather than consonantal) in coda position, but not in onset 
position, so that [l] occurs in words like let and play, where the 
/l/ is in an onset, but [w] occurs in words like feel and felt, 
where the /l/ occurs in coda position (except in cases where 
words such as feel are followed by a word or suffix beginning 
with an empty onset, in which case the /l/ occupies that 
position and is realized as [l]; see 7.7 below on 
resyllabification). This kind of weakening of articulation can 
lead to complete elision (non-pronunciation) of a consonant. 
This is what has happened with [ɹ] in coda position in many 
accents of English. In those accents, words like car and card 
have lost the coda [ɹ], while retaining it in onsets, as in words 
like run and bring. 

Such cases of articulatory weakening, often leading to 
complete loss of articulation, of coda consonants abound in the 
world’s languages. They suggest that coda consonants are 
somehow less salient in perception than onset consonants, and 
studies in the way that human beings retrieve phonological 
forms from mental storage suggest greater prominence for 
onset consonants than for coda consonants: if one is searching 
for a word in one’s lexical memory, one is more likely to 
search on the basis of onset consonants than of coda 
consonants. 

Fifthly, there are no known languages which have VC-type 
syllables but lack CV-type syllables, whereas the reverse is not 
the case. This strongly suggests that CV syllables are more 
basic than VC, or indeed any other, syllable type. 

This generalization about CV syllable structure probably has 
a basis in both articulation and perception. If you try to 
produce a word with an empty onset in isolation (e.g. the word 
eye), you will find it hard to do without uttering some kind of 
consonantal articulation (typically, a glottal stop) before you 
utter the vowel. Preference for filled, rather than empty, onsets 
is probably rooted in the nature of our articulatory apparatus 
and also tied to greater perceptual salience of onset consonants. 

Given the principle of Maximal Onset, it is clear that, in a 
syllable such as the first syllable in appraise, the rhyme 
contains a short vowel (dominated, of course, by a single 
skeletal slot) and does not contain a coda, thus: 

(11) appraise: 

   σ                σ
  / \           ____|____
 O   R         O         R
     |        / \       / \
     N       /   \     N   C
     |      /     \   / \  |
     X     X       X X   X X
     |     |       | |   | |
     ə     p       ɹ e   ɪ z


Syllables such as the first syllable in appraise, in which there 
is no branching within the rhyme, either at the level of the 
rhyme node itself, or within the nucleus, are called light 
syllables. And syllables which have branching anywhere within 
the rhyme constituent are called heavy syllables. This 
distinction in syllable weight is said by some to be important 
in understanding the nature of word stress in English. 
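
The light/heavy distinction depends only on whether anything in the rhyme branches, which makes it easy to state as a function. The sketch below assumes a hypothetical Rhyme record holding the number of skeletal slots in the nucleus and the list of coda consonants; neither the class nor the function name comes from the text.

```python
# A sketch of syllable weight: a syllable is heavy if its rhyme branches
# (it has a coda) or its nucleus branches (two skeletal slots).
# The Rhyme class is an illustrative assumption.

from dataclasses import dataclass, field


@dataclass
class Rhyme:
    nucleus_slots: int                        # 1 = short vowel, 2 = long vowel or diphthong
    coda: list = field(default_factory=list)  # coda consonants, possibly empty


def weight(rhyme):
    """Return 'heavy' if the rhyme or the nucleus branches, else 'light'."""
    branching_rhyme = len(rhyme.coda) > 0
    branching_nucleus = rhyme.nucleus_slots > 1
    return "heavy" if branching_rhyme or branching_nucleus else "light"


print(weight(Rhyme(1)))          # light: first syllable of appraise
print(weight(Rhyme(1, ["t"])))   # heavy: bit
print(weight(Rhyme(2)))          # heavy: bee, buy
```

Onsets play no part in the calculation: a syllable with a branching onset and a short vowel in an open rhyme is still light.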

There are two generalizations about word stress in English 
which some phonologists make. The first is that any stressed 
syllable in English is very likely to be a heavy syllable. The 
second is that monosyllabic words may not end in one of the 
short vowel phonemes (/ʊ/, /ɪ/, /æ/, /ɛ/, /ɒ/ or /ʌ/), since a 
nucleus containing only one of those vowels, with no coda 
consonant, is light, and if a monosyllabic word is to be 
stressed, there is no choice as to which syllable it will be 
stressed on. 

7.4 Language-Specific 
Phonotactics 
Let us now consider some language-specific constraints on the 
sequences of segments which may be combined in syllable 
structure, known as language-specific phonotactic constraints 
(phonotactics, for short). 

We have allowed that, in English syllables, onsets, nuclei, 
rhymes and codas may branch. But we have not said whether 
there is any limit on the number of branches they may have. 
Only one sort of English onset exceeds binary (two-way) 
branching: /s/ + consonant + (/j/, /w/ or /ɹ/) onsets, as in spew, 
square and scream. Note that the range of segments which 
may form the third element in such sequences is even more 
restricted than those in binary branching onsets. 

As we have seen, onsets may branch in English, but if they 
do, there are phonotactic constraints on the form they may 
take. Ignoring the /s/ + consonant cases, we may say that the 
first segment must be a stop or a fricative and the second must 
be /ɹ/, /l/, /j/ or /w/. Thus /pɹ/, /pl/, /pj/, /bɹ/, /bl/, /bj/, /tɹ/, /tw/, 
/dɹ/, /dw/, /kɹ/, /kl/, /kw/, /θɹ/, /θw/, /fɹ/, /fl/, /fj/, /sl/, /sj/ and 
/sw/ are all permissible. This list reflects other onset 
phonotactics. For instance, /t/, /d/ and /θ/ may not be followed 
by /l/, and none of the voiced fricatives may occur in branching 
onsets. 
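
The list of permissible two-consonant onsets given above can be stored as a simple lookup table, which also lets us confirm the two generalizations just stated. This is a sketch under assumptions: the table PERMITTED and the function name are hypothetical, and the table contains only the combinations listed above, not every onset found in English.

```python
# A sketch encoding the binary onsets listed above as a lookup table.
# PERMITTED maps a first consonant to the second consonants it allows.

PERMITTED = {
    "p": {"ɹ", "l", "j"},
    "b": {"ɹ", "l", "j"},
    "t": {"ɹ", "w"},
    "d": {"ɹ", "w"},
    "k": {"ɹ", "l", "w"},
    "θ": {"ɹ", "w"},
    "f": {"ɹ", "l", "j"},
    "s": {"l", "j", "w"},
}

VOICED_FRICATIVES = {"v", "ð", "z", "ʒ"}


def is_permitted_onset(first, second):
    """True if first + second is one of the listed branching onsets."""
    return second in PERMITTED.get(first, set())


print(is_permitted_onset("p", "ɹ"))  # True
print(is_permitted_onset("t", "l"))  # False
print(is_permitted_onset("v", "ɹ"))  # False

# The two generalizations stated above hold of the table:
print(all("l" not in PERMITTED[c] for c in ("t", "d", "θ")))  # True
print(VOICED_FRICATIVES.isdisjoint(PERMITTED))                # True
```

A table of this kind is language-specific by design: a different language would need a different table, whereas the sonority scale is intended to hold across languages.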

Among the phonotactic constraints on rhymes in English, we 
may note the following. Firstly, /h/ does not occur in rhymes in 
English. Secondly, in many accents of English, /ɹ/ does not 
occur in rhymes either; so that words like farm and car 
arguably have phonological forms such as /fɑ:m/ and /kɑ:/, 
without an /ɹ/. Accents which lack /ɹ/ in rhymes are referred to 
as non-rhotic accents; they include Australian English, New 
Zealand English, RP, South African English, most of the 
accents of the North of England, and the Southern and Eastern 
accents of the United States. These accents were rhotic at one 
stage; [ɹ] has been lost in rhymes in those accents. Rhotic 
accents, which have not undergone this historical change, 
include GA, the accents of English spoken in Scotland, and 
some accents spoken in the South West of England. We will 
discuss such accents in more detail later. 

The overall shape of syllables in a language often acts as a 
major factor in adult second language acquisition. For instance, 
simplifying somewhat, Japanese syllable structure does not 
allow for branching onsets. This often has the effect that, when 
native speakers of Japanese utter English words with complex 
onsets, such as screw, they will tend to insert a vowel after 
each of the first three consonants, thus rendering the word 
trisyllabic: [sɯkɯɾɯ:]. This process of vowel insertion is 
known as vowel epenthesis, and such vowels are known as 
epenthetic vowels. Similarly, and again simplifying somewhat, 
Japanese syllable structure does not allow for word-final coda 
consonants, so that English loanwords, such as cake, which 
end in a coda consonant in English, tend to be uttered as 
bisyllabic words ending in a vowel: [keki]. 

Similar cases abound. For instance, Spanish, unlike English, 
does not have words beginning with an s + consonant onset. 
However, Spanish does have words such as España which, in 
some cases, correspond to English words with an s + 
consonant onset (in this case, the word Spain). One of the 
effects of this is that Spanish speakers tend to insert an [e] 
before English words beginning with s + consonant clusters, as 
in [espein] (Spain). Similarly, English loanwords in Spanish are 
pronounced with such an epenthetic [e], as in [esmokin] 
(‘smoking jacket’). 


7.5 Syllabic Consonants and 
Phonotactics 

It is possible for consonants to form the nucleus of a syllable in 
the speech of English speakers, particularly as the rate of 
speech increases. These consonants are called syllabic 
consonants. Three alternative pronunciations of the word 
bottle, for many speakers, are [bɒtəɫ], [bɒtɫ̩] or [bɒʔɫ̩]. In the 
latter two pronunciations, the final unstressed vowel (schwa) 
has been lost, but the word still has two syllables, with the 
lateral becoming syllabified. Syllabic consonants are transcribed 
by means of the ‘syllabic’ diacritic, placed under the 
appropriate consonant symbol. 

Syllabic nasals are common in many varieties of English. An 
example is the word button, which has two syllables. For many 
speakers of English, it may be pronounced [bʌtən] or [bʌtn̩], 
where the second pronunciation has a syllabic nasal. The 
second vowel in the first pronunciation is an unstressed vowel 
(schwa) which may be ‘lost’, particularly in faster or more 
casual speech. A similar example is the word happen, which, 
for many speakers, has (at least) the two pronunciations 
[hæpən] and [hæʔm̩]. Here, the nasal [n] assimilates to the 
‘intended’ bilabial articulation [p], which in turn is articulated 
as a glottal stop. Other examples involving nasals are [ɹɛʔŋ̩] vs 
[ɹɛkən] (reckon), and [kn̩u:] vs [kənu:] (canoe). 

A similar example involving the approximant [ɹ] is the word 
parade, which often has the alternative pronunciations [pəɹeid] 
and [pɹ̩eid]. It is typically nasals, laterals and [ɹ] which undergo 
syllabification in English, although fricatives may also be 
syllabified, as in some pronunciations of, for instance, support, 
which may be [səpʰɔːt] or [s̩pʰɔːt] (as distinct from sport: 
[spɔːt], with one, not two, syllables). 

In English (but not in some other languages), for every case 
in which a syllabic consonant may occur, there will be an 
alternative pronunciation of the word with a vowel preceding 
or following the syllabified consonant. 

All of these words have phonological representations 
containing a vowel in the nucleus of each syllable, as in canoe: 

(12) canoe: 

      σ             σ 
     / \           / \ 
    O   R         O   R 
    |   |         |   | 
    |   N         |   N 
    |   |         |  / \ 
    x   x         x x   x 
    |   |         |  \ / 
    k   ə         n   u 

It is striking that, although English speakers frequently utter 
words such as canoe with a syllabic nasal, as in [kn̩uː], when 
faced with non-English words such as gnu, they will tend to 
insert an epenthetic vowel, uttering the word as [gənu], making 
it conform to English syllable structure, in which /gn/ is not a 
permissible onset. The important point to be made 
here is that constraints on English syllable structure are defined 
in terms of permissible phonological sequences, rather than 
phonetic ones. 

We have said that English does not allow for phonological 
syllabic consonants, but many other languages do. The 
Polynesian language Maori, and many Bantu languages, for 
instance, allow for phonologically syllabic nasals. Examples are 
Bantu names such as Mbabande, Ndola and Nkomo, each of 
which begins, phonologically, with a syllabic nasal, as in 
[ŋ̩komo], which has three syllables. Native speakers of English 
will tend to utter such words with an epenthetic vowel placed 
adjacent to the relevant nasal, as in [mkomo] or [iijkomo], thus 
making the phonetic sequence conform to the English 
phonological pattern. All of these cases provide evidence for 
the phonological vs phonetic distinction we have drawn, and 
show how profound an influence our phonological 
representations can have on our perception and production of 
non-native words. 


7.6 Syllable-Based 
Generalizations 

We said in 7.1 that some of the evidence for the existence of 
the syllable as a phonological constituent comes from the fact 
that there are significant phonological generalizations which 
cannot be adequately expressed without appeal to syllable 
structure. One such generalization concerns the distribution of 
velarized laterals (‘dark l’s) in many accents of English. For 
many speakers, the following sort of distribution between 
velarized and non-velarized /l/ may be found: 

(13) Velarized and non-velarized /l/ 

[lʌɫ] lull         [liːf] leaf          [sliːp] sleep 
[bɒtɫ̩] bottle      [piːəɫ] peel         [mɪɫk] milk 
[lɪli] lily        [lɪɫtɪŋ] lilting     [fɔːɫtə] falter 

One might attempt to state the distribution of [ɫ] and [l] as 
follows: [ɫ] occurs when immediately followed by another 
consonant, or at the end of a word (i.e. when immediately 
followed by a word boundary). But one of the objections to 
this formulation is that it is not clear what, if anything, a 
following consonant and a word boundary might have in 
common. A simpler statement of the distribution, which does 
not entail appeal to this peculiar disjunction of environments, is 
to say that [ɫ] occurs in the rhyme of a syllable, and [l] in the 
onset. Indeed, we might take that syllable-based account of the 
distribution to help us diagnose the syllabic status of the second 
/l/ in lily: we might say that, because the second /l/ is not 
velarized in the speech of many speakers, this confirms our 
claim that it occupies onset position in the second syllable of 
lily, rather than coda position in the first syllable. 
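
The syllable-based statement of the distribution can be sketched as follows. This is a minimal illustration, assuming that each word has already been syllabified into (onset, rhyme) pairs written as plain strings; the function name realize_laterals is hypothetical, and the syllabic diacritic is omitted for simplicity.

```python
def realize_laterals(syllables):
    """Return a transcription with dark [ɫ] in rhymes and clear [l] in onsets.

    syllables: list of (onset, rhyme) pairs, each a string of phonemes.
    """
    realized = []
    for onset, rhyme in syllables:
        # Onset material is left unchanged; any /l/ in the rhyme is velarized.
        realized.append(onset + rhyme.replace("l", "ɫ"))
    return ".".join(realized)

print(realize_laterals([("l", "ɪ"), ("l", "i")]))  # lily
print(realize_laterals([("m", "ɪlk")]))            # milk
```

Note that the function never needs to mention ‘a following consonant’ or ‘a word boundary’: the single reference to the rhyme does the work of both environments.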

Let us consider another example, from London English, of 
the syllable-based nature of some phonological generalizations. 
The vowels [ɒʊ] and [ʌʊ] are said by some phonologists to be 
in complementary distribution in the speech of many speakers 
of London English. The following table exemplifies this: 

(14) [ɒʊ] and [ʌʊ] in London English 

[ɹɒʊl] roll          [kʌʊlə] cola 
[stɹɒʊl] stroll      [lʌʊd] load 
[ɒʊld] old           [tɒmbʌʊlə] tombola 

The claim is that the /ʌʊ/ phoneme is realized as [ɒʊ] when 
it is followed by an /l/ which is in the same syllable as the /ʌʊ/ 
(i.e. when followed by a tautosyllabic /l/), and as [ʌʊ] 
elsewhere. Thus, in the words in the left-hand column, we 
have [ɒʊ]. In the right-hand column, the vowel in load clearly 
lacks a following tautosyllabic /l/; as for cola and tombola, we 
would want to argue, from the principle of Maximal Onset, that 
the /l/s there occupy onset position in the following syllable, 
and thus that the /ʌʊ/ there also lacks a following tautosyllabic 
/l/. 
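
As a rough sketch of this claim, the following Python function realizes /ʌʊ/ as [ɒʊ] only when an /l/ follows within the same rhyme. We assume, for illustration, that words arrive already syllabified as (onset, rhyme) string pairs; the name realize_london_vowel is our own.

```python
def realize_london_vowel(syllables):
    """Realize /ʌʊ/ as [ɒʊ] before a tautosyllabic /l/, and as [ʌʊ] elsewhere."""
    realized = []
    for onset, rhyme in syllables:
        # The rule looks only inside the rhyme: an /l/ in the onset of the
        # next syllable is invisible to it.
        realized.append(onset + rhyme.replace("ʌʊl", "ɒʊl"))
    return ".".join(realized)

print(realize_london_vowel([("ɹ", "ʌʊl")]))             # roll
print(realize_london_vowel([("k", "ʌʊ"), ("l", "ə")]))  # cola
```

Because cola is syllabified with its /l/ in the onset of the second syllable, the rule does not apply there.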

7.7 Morphological Structure, 
Syllable Structure and 
Resyllabification 

The case just cited is a little more complex than we have, thus 
far, suggested. Consider the following further data, from the 
same accent: 

(15) 

[ɹɒʊlə] roller      [ɹʌʊlənd] Roland 
[hɒʊli] holey       [hʌʊli] holy 

What is the phonological status of the diphthong in roller? 
On the one hand, we have said that the [ɒʊ] allophone appears 
before a tautosyllabic /l/. On the other hand, it would appear 
that Maximal Onset would have us syllabify the /l/ into the 
onset position of the second syllable. But if the /l/ is indeed in 
that position, then we ought to get the [ʌʊ] allophone. Why, 
then, do we not get that allophone? Let us consider two 
possible responses. 

One response is to say that speakers of this accent originally 
had a straightforward phonological rule, of the sort we have 
given, for the realization of the /ʌʊ/ phoneme, but that, as the 
accent has evolved, a phonemic split has emerged: the two 
vowels were in complementary distribution, but have come to 
occur in overlapping, parallel distribution. Evidence which is 
cited in support of this view is the emergence of minimal pairs, 
such as holey vs holy. Here, it is argued, we have clear 
evidence that a phonemic split has occurred. 

We might make the following objection to the analysis just 
cited: it is failing to take note of an important generalization 
concerning the members of minimal pairs such as holy and 
holey, namely that those containing the [ʌʊ] vowel consist of 
only one morpheme (they are morphologically simple), 
whereas those containing [ɒʊ] consist of more than one 
morpheme (they are morphologically complex). Furthermore, 
in each case, the relevant vowel occurs before an /l/ which, in 
the morphologically complex cases, is morpheme-final. 

One way of expressing this sensitivity of the phonological 
rule to morphological structure is to say that the rule applies 
prior to the affixation of the suffix in cases like holey, as 
follows: 

(16) root: /hʌʊl/ 

(17) Application of [ʌʊ] rule 

[ʌʊ] → [ɒʊ] before tautosyllabic /l/ 

(18) Affixation 

        σ              σ 
       / \            / \ 
      O   R          O   R 
      |  / \             | 
      | N   C            N 
      | |\   \           | 
      x x x   x          x 
      | | |   |          | 
      h ɒ ʊ   l          i 

(19) Resyllabification: [hɒʊ.li] 

This analysis requires appeal to a further notion: that of 
resyllabification. The idea is that, while the /l/ is initially 
syllabified into coda position, it is resyllabified, after affixation, 
and according to the Maximal Onset principle, into the empty 
onset in the suffix (the idea of resyllabification into an empty 
onset position being part of the motivation for postulating 
empty onsets). 
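
The ordering of operations in this analysis (vowel rule, then affixation, then resyllabification) can be sketched in Python. This is an illustration only: syllables are (onset, rhyme) string pairs, the consonant inventory is a small assumed set, and all function names are hypothetical.

```python
CONSONANTS = set("lmnptkbdgszfv")  # small illustrative inventory (assumption)

def apply_vowel_rule(syllables):
    """Realize /ʌʊ/ as [ɒʊ] before an /l/ in the same rhyme."""
    return [(onset, rhyme.replace("ʌʊl", "ɒʊl")) for onset, rhyme in syllables]

def add_suffix(syllables, suffix_rhyme):
    """Attach a vowel-initial suffix as a syllable with an empty onset."""
    return syllables + [("", suffix_rhyme)]

def resyllabify(syllables):
    """Move a syllable-final consonant into a following empty onset."""
    result = [syllables[0]]
    for onset, rhyme in syllables[1:]:
        prev_onset, prev_rhyme = result[-1]
        if onset == "" and prev_rhyme[-1] in CONSONANTS:
            result[-1] = (prev_onset, prev_rhyme[:-1])
            onset = prev_rhyme[-1]
        result.append((onset, rhyme))
    return result

def transcribe(syllables):
    return ".".join(onset + rhyme for onset, rhyme in syllables)

# holey: the rule applies to the root, then the suffix is added,
# then the /l/ is resyllabified into the empty onset.
holey = resyllabify(add_suffix(apply_vowel_rule([("h", "ʌʊl")]), "i"))
# holy: a single morpheme, syllabified directly by Maximal Onset.
holy = apply_vowel_rule([("h", "ʌʊ"), ("l", "i")])
print(transcribe(holey), transcribe(holy))
```

If resyllabify were applied before apply_vowel_rule, the two words would come out identical, which is the ordering the text goes on to attribute to the /l/ and /iː/ rules in many non-London accents.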

There is an important point to be made with respect to cases 
such as this. It is that there may be cases where a phonetic 
distinction acts as the basis for minimal pairs (in this case, pairs 
like holy/holey) where, nonetheless, we do not wish to 
postulate a phonemic distinction between the two segments. In 
such cases, the members of those pairs will almost always 
differ in their morphological structure (in this case, holy is 
morphologically simple, while holey is morphologically 
complex), and that difference will affect the application of 
generalizations which govern the allophones of phonemes. 

Notice that, in the case of many non-London accents, the 
reverse ordering of phonological generalizations and 
resyllabification applies in the case of the generalizations which 
yield the [iː] and [iːə] allophones of the /iː/ phoneme and the 
clear and dark allophones of the /l/ phoneme. We saw that, in 
many accents, in words like peel and feel, the allophone of /iː/ 
is [iːə], and the /l/ allophone is [ɫ], as in [fiːəɫ] (feel). In cases 
like feeling, the /l/ is resyllabified into the empty onset of the 
suffix -ing, thus: [fiːlɪŋ]. We want to say that the rule (as defined 
on p. 64) which governs the allophones of the /l/ phoneme, and 
the rule which governs the allophones of the /iː/ phoneme, 
apply after resyllabification. 

We will take the view that, while the phonemic principle, in 
distinguishing between contrastive and non-contrastive 
distinctions, embodies an important insight into the nature of 
phonological organization, its conception of phonological 
contrast is at times overly restrictive, since it is defined without 
reference to the influence of morphological and syntactic 
factors on phonological organization. We will therefore, at 
times, allow our analyses to, as it were, override the phonemic 
principle. In doing so, we are not abandoning the notions which 
find a place in that principle; rather, we are allowing that 
morphological factors may influence phonological processes. 

7.8 Summing Up 

We have adopted here an account of the syllable as a 
phonological constituent. We have said that the sub-parts of 
syllables have differing degrees of perceptual salience, so that 
the nucleus is more salient than the other parts of the syllable. 
This is perhaps why, in many languages, coda consonants so 
often diminish in degree of stricture, to the point of ‘fading 
away’ altogether. This is what has happened to the realizations 
of the /ɹ/ phoneme in many accents of English, for instance: at 
one stage in their history, /ɹ/ was realized in coda position. It is 
also what happened to many case endings (suffixes which 
consisted of a vowel and one or more coda consonants) in the 
history of English nouns (English nouns used to have case 
endings such as -um, -am, -ans, -uns, all of which have 
disappeared over time). 

The loss of case endings in English nouns involved another 
factor, however: not only are nuclei more salient perceptually 
than coda consonants, but, in languages like English, some 
nuclei in a word (stressed nuclei) are more salient than others. 
It is to the subject of word stress in English that we now turn. 


Exercises 

1 For each of the following words, say how it is 
syllabified, and why alternative syllabifications are 
disallowed (e.g. the word /kwɒntɪti/ (quantity) is 
syllabified as /kwɒn.tɪ.ti/; the syllabification /kwɒ.ntɪ.ti/ 
violates English phonotactics, since /nt/ is not a 
permissible branching onset; the syllabification /kwɒnt.ɪ.ti/ 
is ruled out by the principle of Maximal Onset): 

(a) /ɹəvaiz/ (revise) 
(b) /pɹədɪkʃən/ (prediction) 
(c) /ɹɛzɪdɛnʃəl/ (residential) 
(d) /ɛmpəɹə/ (emperor) 
(e) /dʒæpəniːz/ (Japanese) 
(f) /kɒndʌkt/ (conduct) 

2 Of the following monosyllabic phonological 
representations, say which are English, and which are 
non-English, representations for monosyllabic words. For 
each non-English form, say why it is not possible: 

(a) /pmt/ 

(b) /psit/ 

(c) /pint/ 

(d) /plst/ 

(e) /pill/ 

(f) /ipaet/ 

3 Examine the following data from the Cockney variety of 
London English: 

(a) 

[bi?] 

light 

(b) 

[laidi] 

lady 
(c) 

[fi:hn] 

feeling 

(d) 

[fnw] 

feel 

(e) 

[g3:w] 

girl 

(f) 

[boVu] 

bottle (bisyllabic) 

(g) 

[filA] 

filler 

(h) 

[filW?A] 

filter 

(i) 

[wai] 

way 

(j) 

[wei?] 

wet 

The sound [w] may be a realization of the /w/ phoneme, 
as in (i) and (j). However, it may also be an allophone of 
the /l/ phoneme, as in (d), (e) and (h). In (f), /l/ becomes a 
syllabic ‘[w]’, namely [u]. That is, there is phonemic 
overlapping between /l/ and /w/. In which contexts does 
the [w] allophone of /l/ occur? Your answer should be 
expressed in terms of syllable structure. 



4 Further transcription practice 

Listen to Track 7.1 at www.wiley.com/go/carrphonetics. 
Transcribe, with as much detail as possible, the words you 
hear on the recording, indicating syllable boundaries (with 
a full stop) and any syllabic consonants (e.g. battle: 
[bæ.ʔl̩], for some speakers): 

hunting kettle derived university wrecking suppost 




8 


Rhythm and Word Stress in 
English 

8.1 The Rhythm of English 

Human beings speak rhythmically: they engage in the act of 
speaking by putting regular beats in the speech signal. You can 
hear those beats in an English utterance such as The man went 
to the bar. Here, the beats are on man, went and bar. In most 
varieties of English, we do not necessarily place a beat on 
every single syllable. In this utterance, no beat falls on the 
preposition to, or on the two occurrences of the. This is 
because English, unlike certain other languages, is stress-timed: 
the rhythmic beats fall only on stressed syllables. In our 
example, only man, went and bar are stressed, so the beats fall 
only on those. English is unlike many other languages in this 
respect. Take the phrase Chicken MacNuggets, the name for a 
product sold by a well-known fast-food company. This is 
pronounced [ˌtʃɪkənməkˈnʌgəts]. There are two stressed 
syllables in this sequence. (The second is more prominent than 
the first: we’ll come back to that.) The sequences [ˌtʃɪkənmək] 
and [ˈnʌgəts] form rhythmic units in the utterance. Those units 
are called metrical feet. A metrical foot in English consists of a 
stressed syllable followed by zero or more unstressed syllables. 
In our example, the first metrical foot contains a stressed 
syllable and two unstressed syllables: [ˌtʃɪkənmək]. The second 
metrical foot contains a stressed syllable and one unstressed 
syllable: [ˈnʌgəts]. Notice that divisions between the metrical 
feet need not coincide with word boundaries: the word 
boundary falls between the word Chicken and the word 
MacNuggets. But the rhythmic boundary falls between 
[ˌtʃɪkənmək] and [ˈnʌgəts]. We call these metrical feet 
trochaic. This is an adjective derived from the noun trochee. A 
trochee is essentially a stressed-unstressed sequence, such as 
[ˈnʌgəts]. We will examine these trochaic metrical feet in more 
detail in chapter 9. For the moment, we need only note that 
word stress patterns are part and parcel of the rhythmic 
structure of English. 
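
The definition of the metrical foot given above (a stressed syllable followed by zero or more unstressed syllables) lends itself to a short sketch. We assume, purely for illustration, that an utterance is supplied as a list of (syllable, is_stressed) pairs; the function name metrical_feet is hypothetical, and the treatment of unstressed syllables preceding the first stress as ‘unfooted’ is a simplifying assumption of ours.

```python
def metrical_feet(syllables):
    """Group syllables into feet, each headed by a stressed syllable.

    Returns (unfooted, feet): unfooted holds any unstressed syllables
    that precede the first stressed syllable.
    """
    unfooted, feet = [], []
    for text, stressed in syllables:
        if stressed:
            feet.append([text])      # a stressed syllable opens a new foot
        elif feet:
            feet[-1].append(text)    # unstressed syllables join the current foot
        else:
            unfooted.append(text)
    return unfooted, feet

nuggets = [("tʃɪ", True), ("kən", False), ("mək", False),
           ("nʌ", True), ("gəts", False)]
print(metrical_feet(nuggets))
```

The first foot that results spans the word boundary between Chicken and MacNuggets, which is the point made in the text: foot boundaries need not coincide with word boundaries.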

8.2 English Word Stress: Is It 
Entirely Random? 

We have already noted that the native speaker’s perceptual 
capacities allow him or her to say how many syllables a word 
has, in the absence of any conscious knowledge of what a 
syllable might be, or how it might be defined. Similarly, English 
speakers can tell which syllable in a word receives most stress, 
in the absence of any conscious knowledge of exactly what 
‘stress’ might be. While the native speaker may not know 
consciously what stress is, it seems clear that, the more 
stressed a syllable is, the more salient it is, perceptually. For 
instance, most native speakers of English will agree that, in the 
word photography, it is the antepenultimate (third last) syllable 
(the one before the penultimate, or second last, one) which is 
most stressed. Equally, most speakers will know that, in 
kangaroo, it is the last of its three syllables which receives most 
stress, and so on. It is equally striking that the native speaker 
can judge that, while the final syllable in kangaroo receives 
more stress than either of the others, the antepenultimate (third 
from last) syllable in turn receives more stress than the 
penultimate (second from last) syllable. The final syllable and 
the penultimate syllable of photography are unstressed, as is 
the syllable before the ante-penultimate syllable. They are 
therefore less salient than the antepenultimate syllable, which 
has primary stress. The penultimate syllable of kangaroo is 
unstressed and is the least salient syllable in that word. Let us 
say that the syllable in a word which receives most stress has 
primary stress, and that syllables such as the ante-penultimate 
syllable in kangaroo have secondary stress; while syllables 
which have neither primary nor secondary stress are unstressed 
syllables. We could therefore say that any given word will have 
a stress pattern : in the case of kangaroo, a final syllable with 
primary stress, preceded by a penultimate unstressed syllable, 
preceded by an antepenultimate syllable with secondary stress. 
We can informally represent primary stress by placing a 
superscript diacritic (ˈ) immediately before the start of the 
appropriate syllable, and secondary stress by using the 
subscript diacritic (ˌ), leaving any unstressed syllables without a 
diacritic, as follows: [ˌkæŋgəˈɹuː]. This notation is used in 
pronouncing dictionaries. 
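
The notation can be generated mechanically from a stress pattern. The sketch below assumes that a word is given as a list of (syllable, stress level) pairs, with the three levels named as in the text; the names MARKS and mark_stress are our own.

```python
MARKS = {"primary": "ˈ", "secondary": "ˌ", "unstressed": ""}

def mark_stress(syllables):
    """Place the stress diacritic immediately before each stressed syllable."""
    return "".join(MARKS[level] + text for text, level in syllables)

kangaroo = [("kæŋ", "secondary"), ("gə", "unstressed"), ("ɹuː", "primary")]
print("[" + mark_stress(kangaroo) + "]")
```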

It seems clear that knowledge of the stress patterns of words 
does not normally require instruction, and children acquiring 
their native language are not normally given explicit instruction 
by their parents as to where the stresses in a word are placed. 
Having noted that, let us consider the following question: how 
does the speaker know what the stress pattern of a given word 
is? It seems reasonable to suggest that the child who is 
acquiring English simply has to memorize the stress pattern of 
each word as it is learned. After all, for any given word, the 
speaker has to memorize the sequence of phonemes which 
make up the phonemic form of the word. Whether one is a 
child acquiring English or an adult learning English, one might 
just as well memorize the stress pattern while one is at it. One 
might object that this means that an average speaker has a vast 
number of stress patterns to memorize (as many stress patterns 
as there are words in her or his vocabulary), but we know that 
human beings are very good at storing large amounts of 
information of this sort in memory. Again, we can point to the 
vast number of phonological forms which the speaker clearly 
must have in mental storage; it is, surely, not too tasking to an 
organism which has that kind of storage capacity to store the 
stress patterns of those words along with the sequence of 
phonemes which partly make up its phonological form. 

But this is not to say that there are no unconsciously stored 
generalizations governing stress patterns in English. We know 
that, for some languages, such as Modern Greek, the stress 
pattern of most words is entirely arbitrary. We also know that 
some languages have fixed stress: the stress always falls on a 
given syllable (in French, for instance, it always falls on the 
final syllable of the word, and in Polish on the second last 
syllable of the word). Let us consider some evidence in favour 
of the idea that the native speaker of English has unconsciously 
formed generalizations concerning word stress patterns in 
English. Take the following two sets of bisyllabic English 
words (words with two syllables), all of them morphologically 
simple (i.e. containing only one morpheme: no prefixes, no 
suffixes): 

(1) 

[ˈsɪmpl̩] simple       [ˌbæmˈbuː] bamboo 
[ˈkʰɑːpət] carpet     [dʒɪˈɹɑːf] giraffe 
[ˈgæðə] gather        [pəˈɹeid] parade 
[ˈsʌdən] sudden       [ænˈtiːk] antique 
[ˈɪnsɛkt] insect      [kɹiˈeit] create 


The words in the first column have primary stress on the 
penultimate syllable, those in the second column have primary 
stress on the final syllable. On the face of it, it looks as if 
bisyllabic words in English may have the primary stress on 
either the final or the penultimate syllable. Perhaps it’s all 
entirely arbitrary: perhaps there are no rules. Consider, 
however, the following French words as uttered by speakers of 
French and by speakers of English with a noticeably English 
pronunciation of French (you may ignore any unfamiliar 
phonetic symbols in the French transcriptions: it’s the stress 
patterns that matter here): 


(2) 

                              French speaker    English pronunciation 
manger    ‘to eat’            [mɑ̃ˈʒe]           [ˈmɒ̃ʒei] 
chercher  ‘to look for’       [ʃɛʁˈʃe]          [ˈʃ…ʃ…] 
bateau    ‘boat’              [baˈto]           [ˈbætəʊ] 
français  ‘French’            [fʁɑ̃ˈsɛ]          [ˈfɹɒ̃sei] 
lointain  ‘distant’           [lwɛ̃ˈtɛ̃]          [ˈlwʌntein] 


The word stress rule in French could not be simpler: stress 
the final syllable of the word. So why do so many speakers of 
English have this strong tendency to stress the penultimate 
syllable in so many French words when they try to speak 
French? It’s not as if the French rule is hard to grasp. And, 
after all, English speakers have countless words in their 
language in which bisyllabic words are stressed on the final 
syllable: it’s not as if they are not in the habit of placing 
primary stress on the final syllable of bisyllabic words in their 
own language. So what is ‘typically English’ about such stress 
patterns in French bisyllabic words? 

Consider too the following trisyllabic French words and a 
typical English mis-stressing of them: 

(3) 

                                  French speaker    English pronunciation 
bâtiment       ‘building’         [batiˈmɑ̃]         [ˈbætimɑ̃] 
fermeture      ‘closure’          [fɛʁməˈtyʁ]       [ˈfɛʁmətyʁ] 
soigneusement  ‘meticulously’     [swaɲøzˈmɑ̃]       [ˈswaɲøzmɑ̃] 
sortilège      ‘spell’            [sɔʁtiˈlɛʒ]       [ˈsɔʁtilɛʒ] 
consacrer      ‘to devote’        [kɔ̃saˈkʁe]        [ˈkɔ̃sakʁe] 


Why should there be a tendency among English speakers to 
mis-stress such trisyllabic words on the antepenultimate 
syllable? What is it that makes such a mis-stressing ‘typically 
English’? 

Finally, consider the following trisyllabic nouns in English: 

(4) 

[ˈɛləfənt] elephant    [pəˈtʰeitəʊ] potato    [ˌkæŋgəˈɹuː] kangaroo 
[ˈsɪnəmə] cinema       [bəˈnɑːnə] banana      [ˌɹɛfjʊˈdʒiː] refugee 

The words in the first column have primary stress on the 
antepenultimate syllable, those in the second column on the 
penultimate syllable, and those in the third column on the final 
syllable. Given that the words do not differ as to syntactic 
category (they are all nouns) and do not differ in terms of total 
number of syllables, it looks as though there is no 
generalization concerning primary word stress in English 
trisyllabic nouns: it all looks entirely arbitrary, as if there were 
no rules. Consider, however, the following non-English 
trisyllabic nouns: 

(5) 

Gigondas moussaka Zaventem tavola 

The first of these words is French (it is the name of a town, 
and a wine, in the Southern Rhone Valley), and is stressed, like 
all French words, on the final syllable. The second is Greek, 
and is stressed by Greek speakers on the final syllable. The 
third is Dutch (it is the name of a town in Belgium, and the 
name of Brussels airport); it is stressed, in Dutch, on the 
antepenultimate syllable. The fourth is the Italian word for 
‘table’ and is stressed, in Italian, on the antepenultimate 
syllable. 

What is striking about English speakers who know little or no 
French, Greek, Dutch or Italian, and have never heard the 
words before, is that they show a very strong tendency, on first 
encountering them, to mispronounce these words by stressing 
them on the penultimate syllable, as follows: [dʒɪˈgɒndəs], 
[muˈsɑːkə], [zəˈvɛntəm], [təˈvəʊlə]. If the English speaker has no 
word stress generalizations, this tendency is deeply puzzling, 
since that would mean that, given a word one has never 
encountered before (especially a foreign word), one should 
display no tendency to prefer placing the stress on any 
particular syllable. One might expect a given individual to utter 
each word variably on different occasions, with each of the 
three possible stress patterns. And even if a given speaker 
alighted, arbitrarily, on a given stress pattern and stuck to it 
thereafter, one would expect variation from speaker to speaker. 
But this does not seem to happen: the pronunciation in which 
the second syllable is stressed is the one which they tend to opt 
for. You will probably agree, especially if you know any 
French, Dutch or Italian and have heard English speakers 
mispronouncing words in those languages, that this kind of 
pronunciation is ‘typically English’. But, once again, what is 
‘typically English’ about it? We can only answer such questions 
if there are word stress generalizations, or at least word stress 
tendencies, in English, and if we know what they are. We will 
now look at the form those generalizations take. 

8.3 English Word Stress: 

Some General Principles 

A first general principle (Principle 1: The End-Based Principle) 
is that the placement of primary stresses in English words is 
calculated by counting from the end of the word: the primary 
stress in a word will tend to fall on either the final syllable of 
the word, the penultimate (second last) syllable or the 
antepenultimate (third last) syllable (though it can fall earlier 
than that). This reflects the fact that most varieties of English 
have word stress patterns which are essentially trochaic. Recall 
that the adjective trochaic is derived from the noun trochee, 
and that a trochee is a stressed syllable (whether primary 
stressed or secondary stressed) followed by zero or more 
unstressed syllables. We will say that stressed monosyllabic 
words (such as box), words with penultimate stress (such as 
spider and departure) and words with antepenultimate stress 
(such as cinema and America) all exhibit trochaic stress 
patterns. The rhythm of most varieties of English is trochaic: 
there is a tendency to place the rhythmic beat on the stressed 
syllables of trochaic feet. 

A second general principle (Principle 2: The Rhythmic 
Principle) is that, while it is possible for English words to end 
with as many as four unstressed syllables (as in 
unˈgentlemanliness), English words cannot begin with more than 
one unstressed syllable. Principle 2 is directly related to 
Principle 1: the reason why we do not begin words with 
sequences of two or more unstressed syllables is that, if we 
place a secondary stress within such sequences, we can create 
a trochaic foot, which is desirable from the point of view of the 
rhythmic structure of English. When we derive Japanese from 
Japan, the primary stress shifts from the final syllable of Japan 
onto the final syllable of Japanese (we will examine such stress 
shifts shortly). But, having shifted the primary stress, we 
cannot leave the word Japanese with a sequence of two 
unstressed syllables preceding the primary stressed syllable: we 
must place a secondary stress on one of the two preceding 
syllables: ˌJapaˈnese. 

When this happens, a third general principle (Principle 3: 
The Derivational Principle) comes into play: there is a 
tendency to place the secondary stress on the syllable which 
had primary stress in the deriving word (the word which we 
are deriving the more complex word from). For instance, the 
word characterization exhibits a shift of primary stress from its 
deriving word ˈcharacterize (which itself is derived from 
ˈcharacter). Since Principle 2 dictates that we cannot leave the 
word characterization with a series of four unstressed syllables 
prior to the syllable with primary stress, we must place a 
secondary stress somewhere in that sequence of syllables. The 
Derivational Principle says that we ought to place it on the 
syllable which had primary stress in the deriving word: 
ˌcharacteriˈzation. 

However, Principle 3 may be overruled by a fourth general 
principle (Principle 4: The Stress Clash Avoidance Principle), 
which states that one should try to avoid having two adjacent 
stressed syllables. In the case of ˌcharacteriˈzation, both 
principles are adhered to. But note that, in the case of 
Japaˈnese, this is not the case: the primary stress in the deriving 
word Japan falls on the final syllable. Principle 3 dictates that 
we place the secondary stress on that syllable, but Principle 4 
dictates that we do not, since this would result in a stress clash, 
with two adjacent stressed syllables: Jaˌpaˈnese. Cases like this 
demonstrate that, where the Derivational Principle and the 
Stress Clash Avoidance Principle come into conflict, and we 
have an option as to which syllable to place the secondary 
stress on, it is the Stress Clash Avoidance Principle (Principle 4) 
which predominates: in a word like ˌJapaˈnese, we place the 
secondary stress on the antepenultimate syllable in order to 
satisfy the Stress Clash Avoidance constraint. The Stress Clash 
Avoidance Principle is a strong general tendency in English. 
But there are words in English which violate that principle, 
such as the verb ˌreˈrun, and the nouns ˌchamˈpagne and 
ˌDunˈdee. We will look at these in more depth in the following 
chapter. As far as the Derivational Principle is concerned, there 
are words such as Burmese and Chinese, which are derived 
from ˈBurma and ˈChina. In such cases, we have no choice 
but to place the secondary stress on the penultimate syllable: 
ˌBurˈmese, ˌChiˈnese. But where we have a choice between 
two or more syllables, as in ˌJapaˈnese and ˌPortuˈguese, we 
both avoid a stress clash and also create a sequence of two 
trochaic feet within the word. So Principle 4, the Stress Clash 
Avoidance Principle, often works together with Principle 2, the 
Rhythmic Principle, to create extra trochaic feet at the 
beginnings of words. Why, despite the avoidance of sequences 
of unstressed syllables at the beginning of a word, is it 
nonetheless possible to have as many as four unstressed 
syllables at the end of a word, as in unˈgentlemanliness? Because 
such words end with certain kinds of suffix. We will consider such 
suffixes shortly. For the moment, let us look further at word 
stress in morphologically simple words. 
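
The interaction of Principles 2, 3 and 4 in placing secondary stress can be sketched as a small decision procedure. This is a simplification under stated assumptions: syllables are counted from 0 at the start of the derived word, the position of the deriving word's primary stress is given as an index into the derived word, and, where the principles leave a choice, the sketch picks the leftmost non-clashing syllable (our assumption, which fits the examples in the text but is not a claim about English in general). The function name secondary_stress is hypothetical.

```python
def secondary_stress(primary, deriving=None):
    """Return the index of the secondary stress before the primary stress.

    primary:  index of the syllable with primary stress in the derived word
    deriving: index of the syllable which had primary stress in the
              deriving word, or None if there is no deriving word
    Returns None when no secondary stress is required.
    """
    pretonic = list(range(primary))
    # Principle 4: avoid the syllable immediately before the primary stress.
    non_clashing = [i for i in pretonic if i != primary - 1]
    if deriving in non_clashing:
        return deriving            # Principle 3 satisfied without a clash
    if non_clashing:
        return non_clashing[0]     # Principles 2 and 4; leftmost is an assumption
    if deriving in pretonic:
        return deriving            # no alternative: the clash is tolerated
    return None

print(secondary_stress(primary=2, deriving=1))  # Japanese
print(secondary_stress(primary=4, deriving=0))  # characterization
print(secondary_stress(primary=1, deriving=0))  # Burmese
```

For Japanese, the deriving stress would clash with the primary stress, so the initial syllable is chosen instead; for Burmese, there is only one candidate syllable, so the clash is tolerated.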

8.4 Word Stress Assignment 
in Morphologically Simple 
Words 

English is a Germanic language which has borrowed a huge 
amount of vocabulary from Latinate languages, notably French 
and Latin, many of them with Latinate suffixes and prefixes. 
The effect of this has been to make the word stress patterns 
more complex than they would otherwise have been, and 
non-native speakers will testify to the difficulty they often 
experience in trying to master the stress patterns of English 
words. Nonetheless, there is considerable regularity in English 
word stress patterns. Let us begin by considering words which 
clearly do not have prefixes or suffixes, in present-day English 
(though some of these had prefixes historically). We will 
distinguish words of a lexical category from words of a non- 
lexical category. Words of a lexical category are nouns, verbs, 
adjectives and adverbs. Words of a non-lexical category include 
prepositions, determiners (such as the, this, his), pronouns 
(such as he, her) and the conjunction and. Words of a non- 
lexical category, often referred to as function words, are not 
normally stressed. Among the words of a lexical category, 
primary stress placement may vary, depending on the syntactic 
category of the word. 

Monosyllabic words of a lexical category (such as box, run, 
big) are unproblematic: there is only one syllable for the 
primary stress to fall on. Let us therefore move on to 
morphologically simple bisyllabic words, and then proceed to 
morphologically simple polysyllabic words (words with three or 
more syllables). 

8.4.1 Morphologically Simple 
Bisyllabic Words 

8.4.1.1 Bisyllabic Nouns 

The basic pattern here is the native Germanic trochaic pattern, 
that is, with primary stress on the penultimate syllable, as in 
Arab, bigot, carpet, district, effort, female, gremlin, harbour, 
lemon, market, native, person, rabbit, senate, turnip, vampire, 
woman. However, there is a substantial class of exceptions to 
this basic pattern, in which bisyllabic nouns have final stress. 
Among these are nouns which have been borrowed from other 
languages and are written with a double vowel letter, as in 
balloon, bazaar, canteen, harpoon, marquee, papoose, 
raccoon, shampoo, taboo, veneer. Another set of bisyllabic 
nouns stressed on the final syllable have been borrowed from 
French; they contain the French endings -ee, -ette, -ade, -elle, 
-esse, -asse, -eur, -euse, as in grandee, gazette, parade, 
gazelle, finesse, crevasse, liqueur, masseuse. Since word stress 
in French always falls on the final syllable, words such as these 
have mostly retained the French stress pattern. Other French 
bisyllabic loanwords which have retained final stress include 
hotel: [həʊˈtɛl]. Had this word been fully nativized, it would 
be pronounced [ˈhəʊtəl], with the native Germanic trochaic 
stress pattern. We assume here that words such as gazette, 
parade, etc. are mostly morphologically simple in 
contemporary English. 

8.4.1.2 Bisyllabic Adjectives 

The basic Germanic pattern is again trochaic, i.e. with stress on 
the penult, as in angry, brilliant, central, crazy, dozy, fragile, 
frigid, happy, honest, lazy, modest, narrow, orange, purple, 
sudden, timid, urgent, yellow. The trochaic pattern can also be 
found in morphologically simple bisyllabic adjectives ending in 
-ic, as in cosmic, frantic, Nordic, static. However, there are 
bisyllabic adjectives with final stress, such as complete, 
immense, intense, precise, select. These contain historical 
prefixes which are Latinate in origin: they come from Latin or 
French. Most of these historical prefixes no longer count as 
productive prefixes in contemporary English (they cannot be 
freely combined with roots to form new words). Large 
numbers of these words are now morphologically simple for 
the vast majority of English speakers. In addition 
to these, bisyllabic adjectives which contain what were, 
historically, French suffixes, such as bizarre and grotesque, 
have final stress. As with nouns like gazette and parade, we 
assume here that adjectives such as bizarre and grotesque are 
morphologically simple. 

8.4.1.3 Bisyllabic Adverbs 

Once again, the basic Germanic pattern is trochaic. Many 
bisyllabic adverbs end in -ly, as in slowly and quickly. We will 
deal with those in 8.5. Other adverbs which do not end in -ly 
include rather and very. These also have a trochaic stress 
pattern. 


8.4.1.4 Bisyllabic Verbs 

The basic trochaic tendency in most varieties of English is 
much less evident in bisyllabic verbs: there are many with final 
primary stress, as we have seen: compact, detract, deny, 
export, impose, object, permit, produce, record, subsume, 
transgress. As with adjectives, many of these historically had 
prefixes in the languages they were borrowed from: French or 
Latin (e.g. com-pact, de-tract, ex-port, im-pose, ob-ject, 
re-cord, sub-sume). The tendency to avoid stressing the historical 
prefix is strong: it is striking that such bisyllabic verbs differ so 
often from their bisyllabic noun counterparts, as in the 
verb-noun pairs con'tract (verb) but 'contract (noun), di'gest (verb) 
but 'digest (noun), pro'duce (verb) but 'produce (noun), 
ex'port (verb) but 'export (noun), com'pound (verb) but 
'compound (noun), etc., where the verb takes final stress but 
the noun takes the normal trochaic penultimate stress. (But 
there are exceptions to this pattern, as in con'cern, where the 
noun and verb both take the verb pattern, and 'preface, where 
the noun and verb both take the noun pattern.) 

It is also worth noting that bisyllabic verbs ending in -ate will 
take primary stress on the -ate, as in create, deflate, locate, 
migrate, placate, sedate. These have to be distinguished from 
-ate words with three or more syllables (see below). Note that, 
in GA, some of these words take penultimate stress, such as 
frustrate and locate. 

There are, however, bisyllabic verbs with the basic trochaic 
stress pattern, such as argue, canter, dither, enter, equal, falter, 
gather, govern, hurry, manage, market, marry, narrow, rattle, 
sully, travel. 

8.4.2 Morphologically Simple 
Polysyllabic Words 

8.4.2.1 Polysyllabic Nouns 

The basic Germanic pattern is trochaic. For words of more 
than two syllables, this means having primary stress on the 
antepenultimate syllable, as in academy, America, antelope, 
camera, cinema, custody, deficit, elephant, emperor, harmony, 
library, melody, paradise, quantity, strategy. However, there 
is a substantial class of exceptions to this basic pattern, in 
which polysyllabic nouns have final stress. Among these are 
nouns which have final syllables which are written with a 
double vowel letter, as in kangaroo. Another set of polysyllabic 
nouns stressed on the final syllable have been borrowed from 
French; they contain the French endings -ette, -ade, -elle, 
-esque, -eur, as in cigarette, lemonade, bagatelle, picturesque, 
connoisseur. Words such as these are mostly morphologically 
simple in contemporary English. That is, they do not really 
have a suffix: while there is a morpheme cigar, a cigarette is 
not a small cigar, for instance. 

There is a set of nouns which have consonant clusters after 
the penultimate vowel, and these have penultimate primary 
stress, as in advantage, apartment, consensus, disaster, 
objective. There is also a set of nouns which have three or 
more syllables and which end in -ics. These too tend to have 
penultimate stress, as in acoustics, electrics, linguistics, 
logistics, mathematics, statistics. 

Finally, there is a set of loanwords ending in a vowel which 
depart from the basic antepenultimate pattern, and take 
penultimate stress, such as banana, bikini, chapatti, chorizo, 
karate, martini, moussaka, mosquito, potato, samosa, tomato, 
volcano. 


8.4.2.2 Polysyllabic Adjectives 

Again, the basic pattern is trochaic, with antepenultimate 
stress, as in general, intelligent, juvenile, taciturn. However, 
there is a set of adjectives with a consonant cluster after the 
penultimate vowel, and these take penultimate stress, as in 
dependent, disastrous, indulgent, clandestine, momentous, 
objective, tremendous. These include words which have an 
‘rC’ cluster in rhotic accents, and which used to have an ‘rC’ 
cluster in non-rhotic accents, as in enormous, maternal. 

Polysyllabic adjectives ending in -ate have antepenultimate 
primary stress, as in deliberate, elaborate, fortunate, 
inadequate, legitimate. These are therefore unlike bisyllabic 
verbs ending in -ate, which have primary stress on the -ate 
(but American speakers have penultimate primary stress in 
some of these words, as we have seen). 

8.4.2.3 Polysyllabic Verbs 

As with bisyllabic verbs, polysyllabic verbs often flout the basic 
trochaic English word stress pattern: there are many verbs with 
three or more syllables which have final primary stress, as in 
entertain, intervene, intersect. As with bisyllabic verbs, many 
of these have etymological prefixes which are Latinate in origin 
(such as enter- and inter-). 

Unlike bisyllabic verbs ending in -ate, polysyllabic verbs 
ending in -ate follow the basic antepenultimate primary stress 
pattern, as in co-ordinate, deliberate, elaborate, investigate, 
originate. These verbs thus have the same stress pattern as 
polysyllabic adjectives ending in -ate. 

We have now identified four factors which may play a part 
in word stress assignment in morphologically simple words in 
English. Firstly, the syntactic category of the word may play a 
role: we saw, for instance, that many verbs depart from the 
basic trochaic pattern. Secondly, we saw that the presence of 
historical (etymological) prefixes of a Latinate origin can affect 
the stress pattern of a word: we saw that Latinate affixes such 
as ex-, pro-, inter-, although they are mostly no longer 
productive prefixes in present-day English, typically fail to take 
primary stress. Thirdly, we saw that spelling plays a role in 
word stress. For instance, words such as shampoo, papoose 
and kangaroo all exhibit primary stress on a final syllable 
containing a double vowel letter. Connected with this is a 
fourth factor, i.e. the presence of loanwords in English: words 
such as shampoo and hotel have retained the stress pattern of 
the language they were borrowed from. Note too that both 
loanwords and the existence of Latinate affixes are at work in 
the stress patterns of words such as gazelle, grotesque, 
cigarette, picturesque and connoisseur. 

We may now return to the stress patterns we considered in 
8.2. Recall that we noted that English speakers tend to mis- 
stress bisyllabic French words such as manger (‘to eat’) and 
bateau (‘boat’) by placing the stress on the penultimate 
syllable. We can now see the reason for this: it is because that 
is the basic pattern for bisyllabic words in English. We also 
noted that English speakers tend to place primary word stress 
on the antepenultimate syllable of polysyllabic French words 
such as fermeture (‘closure’) and bâtiment (‘building’). The 
reason is that this is the basic pattern for English polysyllabic 
words. That is what is typically English about those mis- 
stressings. In the pronunciation of polysyllabic words such as 
Gigondas and Zaventem, the spelling plays a role: the existence 
of two consonant letters in the spelling of the penultimate 
syllable of the word in English often shifts the stress away 
from the basic antepenultimate stress pattern to a penultimate 
stress pattern. Finally, polysyllabic words such as moussaka 
and tavola fall within the class of loanwords ending in a vowel 
which exceptionally take penultimate primary stress. 
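
The default patterns described in 8.4 can be summarized as a small decision procedure. The following Python sketch is an illustration only: the function name default_stress and its three flags are our own assumptions, the caller must supply the flags by hand, and the many exceptions discussed above (French endings, historical prefixes, verbs) are deliberately left out.

```python
def default_stress(syllables, final_double_vowel=False,
                   cluster_after_penult=False, vowel_final_loan=False):
    """Return the 0-based index of the syllable taking primary stress.

    A simplified sketch of the default patterns for morphologically
    simple nouns and adjectives. The flags are hypothetical inputs that
    the caller supplies; nothing here detects them from the spelling.
    """
    n = len(syllables)
    if n == 0:
        raise ValueError("need at least one syllable")
    if n == 1:
        return 0          # monosyllables: only one syllable available
    if final_double_vowel:
        return n - 1      # balloon, kangaroo: final stress
    if n == 2:
        return 0          # carpet: trochaic, penultimate stress
    if cluster_after_penult or vowel_final_loan:
        return n - 2      # disaster, banana: penultimate stress
    return n - 3          # camera, America: antepenultimate stress
```

For instance, default_stress(['ca', 'me', 'ra']) picks the first syllable, whereas passing vowel_final_loan=True for ['ba', 'na', 'na'] picks the second.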

Let us now consider word stress in morphologically complex 
words. 


8.5 Word Stress Assignment 
and Morphological Structure 

English has both suffixes (as in sorted, with the -ed suffix) and 
prefixes (as in indirect, with the prefix in-). Let us begin with 
suffixes. These may be subdivided into inflectional and 
derivational suffixes. The addition of an inflectional suffix is 
often said to produce ‘a different form’ of the word one would 
have if the suffix had not been added. For instance, when the 
suffix -ing is added to the verb obscure, the resulting word, 
obscuring (as in He’s obscuring the issue), is said to be a form 
of that verb; when the plural suffix is added to the noun 
tractor, the resulting word tractors is a form of that noun. But 
when a derivational suffix is added to a word, it is said to 
produce not a different form of the same word, but another 
word. Thus, when the suffix -ly is added to an adjective, say 
bold, the result, the adverb boldly, is a distinct word. Similarly, 
when the suffix -ness is added to an adjective, as in boldness, 
the result is a distinct word. Other examples of derivational 
suffixes in English are -ity (as in personal/personality), -ee (as 
in divorce/divorcee), -al (as in person/personal), -ian (as in 
Wagner/Wagnerian), -ic (as in atom/atomic), -ish (as in 
green/greenish), -y (as in sleep/sleepy), etc. Inflectional 
suffixes are not stressed, and have no effect on word stress in 
English words, as can be seen from pairs such as 'refuge/ 
'refuges (plural suffix), 'comment/'commented (past tense 
suffix), de'velop/de'veloping (progressive suffix), and 
'varnish/'varnishes (present tense suffix). 

Among the English derivational suffixes, some have no effect 
on stress when added to a word, while others do affect the 
stress pattern. These two classes of suffix are referred to as the 
stress-neutral and stress-shifting suffixes, respectively. The 
stress-shifting suffixes are all of Latinate descent, i.e. from 
French or Latin. But not all Latinate suffixes are stress-shifting. 
Let us begin by considering stress-neutral suffixes. 

8.5.1 Stress-Neutral Suffixes 

English is a Germanic language; all native Germanic suffixes 
are stress-neutral. These include the adverbial suffix -ly, as in 
the words 'brightly, 'deeply, 'dimly, 'madly, 'quickly, 'slowly, 
'truly, etc. We can see here that the stress does not shift when 
the suffix is added to the monosyllabic adjective from which 
the adverb is derived ('bright, 'deep, 'dim, etc.). The stress 
pattern of these adverbs is therefore the basic trochaic pattern 
for English bisyllabic words: stress on the penultimate syllable. 
Polysyllabic adverbs ending in -ly include 'cheerily, 'happily, 
in'credibly, re'markably, tre'mendously, 'wearily. Again, the 
primary stress remains where it was in the adjective which the 
adverb is derived from: 'cheery, 'happy, in'credible, 
re'markable, tre'mendous, 'weary. In all of these cases, the 
resulting adverb is a polysyllabic word with the basic trochaic 
pattern for polysyllabic words, i.e. antepenultimate stress. 
However, the primary stress will fall earlier than the 
antepenultimate syllable if the deriving adjective has primary 
stress on its antepenultimate syllable or earlier, as in 
'comfortably, derived from 'comfortable which has three 
syllables in RP, but four in GA. Note that adjectives such as 
comfortable and idle, which have a syllabic /l/, as in [ˈaɪdl̩], 
will lose a syllable when -ly is added: idly is pronounced 
[ˈaɪdli]. 

The native Germanic adjectival suffixes such as -er, -est, 
-ish, -ful, -less, -y also have no effect on stress, so when they 
are added to monosyllabic roots, the result is a trochaic stress 
pattern, as in 'green/'greener, 'green/'greenest, 'green/ 
'greenish, 'hope/'hopeful, 'mind/'mindless, 'slime/'slimy. 
When these stress-neutral suffixes are added to bisyllabic 
words with the basic trochaic stress pattern, the result is an 
antepenultimate stress pattern, as in 'heavy/'heavier, 'heaviest, 
'heavyish, 'penny/'penniless, 'pity/'pitiful, 'summer/ 
'summery. Once again, in cases where a syllabic consonant is 
possible in the deriving word (e.g. bubble: [ˈbʌbl̩]), a syllable is 
lost when -y is added, as in 'bubble/'bubbly, 'crumble/ 
'crumbly, 'purple/'purply, 'winter/'wintry, 'wriggle/'wriggly. 
The result is therefore a bisyllabic adjective with penultimate 
stress. 

In addition to the -er suffix which marks the comparative 
form of adjectives (as in greener), there is a stress-neutral 
native Germanic -er suffix which can be used to form nouns 
from verbs, as in 'advertise/'advertiser, 'love/'lover, 'make/ 
'maker, pre'tend/pre'tender, 'publish/'publisher, 'sing/'singer. 
We can see from these examples that the addition of this -er 
suffix has no effect on stress. 

Other native Germanic suffixes, used to form nouns, are 
-ess, -hood, -ism, -ist, -ness, and -ship, as in 'priest/'priestess, 
'shepherd/'shepherdess, 'child/'childhood, 'adult/'adulthood, 
'Marx/'Marxism, 'Thatcher/'Thatcherism, 'Marx/'Marxist, 
'Union/'Unionist, 'capital/'capitalist, 'kind/'kindness, 'gentle/ 
'gentleness, 'friend/'friendship, a'pprentice/a'pprenticeship. 
Once again, the addition of the suffix does not shift the stress. 

There are Latinate suffixes which fail to shift stress. Among 
these are the adjective-forming bisyllabic suffix -able, 
pronounced [əbl̩], as in de'batable, de'pendable, 'doable, 
per'suadable, 'sellable. We can see that, when -able is added to 
the morphemes de'bate, de'pend, 'do, per'suade, 'sell, the 
stress does not shift. The result, in these cases, is an adjective 
with antepenultimate primary stress. If -able is added to a 
word which does not have primary stress on the final syllable, 
such as 'argue, 'manage, 'market or 'perish, the result is a 
word with primary stress placed prior to the antepenultimate 
syllable: 'arguable, 'manageable, 'marketable, 'perishable. 
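
The effect of a stress-neutral suffix can be pictured as simple counting: the stressed syllable stays where it was, and only its distance from the end of the word changes. The Python sketch below illustrates this under our own assumptions; the function name add_neutral_suffix, its parameters and the label 'before the antepenultimate' are introduced here for illustration, and syllable counts must be supplied by hand.

```python
POSITION_NAMES = {1: "final", 2: "penultimate", 3: "antepenultimate"}

def add_neutral_suffix(base_syllables, stress_index, suffix_syllables=1,
                       syllables_lost=0):
    """Attach a stress-neutral suffix and report where the stress now sits.

    base_syllables   -- number of syllables in the base word
    stress_index     -- 0-based index of the primary-stressed syllable
    suffix_syllables -- syllables the suffix contributes (-able has two)
    syllables_lost   -- 1 where a syllabic consonant is lost (idle/idly)
    Returns (total syllables, unchanged stress index, position name).
    """
    total = base_syllables + suffix_syllables - syllables_lost
    from_end = total - stress_index   # 1 = final, 2 = penultimate, ...
    name = POSITION_NAMES.get(from_end, "before the antepenultimate")
    return total, stress_index, name
```

For example, add_neutral_suffix(2, 0) models 'happy/'happily and reports antepenultimate stress, while add_neutral_suffix(2, 0, suffix_syllables=2) models 'manage/'manageable, where the stress falls before the antepenultimate syllable.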

8.5.2 Stress-Shifting Suffixes 

Among the stress-shifting derivational suffixes, we may 
distinguish between those on which the primary stress falls, 
and those which shift the stress within the base form to which 
the suffix is attached. Let us begin with suffixes on which the 
primary stress falls. 

The suffixes -ee, -eer and -ese all take primary stress, as in 
em'ploy/,emplo'yee, 'mountain/,mountai'neer and 
com'puter/com,pute'rese. Notice that, in polysyllabic words 
containing one of these suffixes, we cannot leave the derived 
word with a sequence of unstressed syllables at the beginning: 
a secondary stress must occur. This is the Rhythmic Principle 
introduced in 8.3. The Derivational Principle and the Stress 
Clash Avoidance Principle may also play a role here. Take the 
pair computer/computerese: the -ese suffix takes primary 
stress, but the Rhythmic Principle says that we cannot leave 
the resulting word with a sequence of unstressed syllables at 
the beginning of the word. The Derivational Principle says to 
put that secondary stress on the syllable which had primary 
stress in the deriving word, in this case on the syllable which is 
the final syllable of the verb compute. Thus the stress pattern 
com,pute'rese. There is no violation here of the Stress Clash 
Avoidance Principle, since the secondary and primary stresses 
do not fall on adjacent syllables. Recall, however, that in a 
word such as Japanese, there is a clash between the 
Derivational Principle and the Stress Clash Avoidance 
Principle. Once we have placed primary stress on the -ese 
suffix, the Rhythmic Principle insists on a secondary stress. 
The Derivational Principle says that this secondary stress 
should fall on the syllable containing the primary stress in the 
deriving word Japan. But if we were to place the secondary 
stress there, this would violate the Stress Clash Avoidance 
Principle. As we have seen, when there is a conflict between 
those two principles, it is the Stress Clash Avoidance Principle 
which predominates. Thus the stress pattern ,Japa'nese. The 
same situation arises for words such as ,emplo'yee, in which 
the secondary stress does not fall where the primary stress falls 
in the word em'ploy. 
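
The interaction of the principles just described can be sketched as a short procedure. This is an illustration under stated assumptions, not a full model: the function name stress_bearing_suffix and its parameters are ours, syllable counts are supplied by hand, and the choice to move a clashing secondary stress exactly one syllable to the left is a simplification that fits the examples in the text.

```python
def stress_bearing_suffix(n_derived, base_stress_index):
    """Place stress in a word ending in -ee, -eer or -ese.

    n_derived         -- number of syllables in the derived word
    base_stress_index -- 0-based index of primary stress in the base
    Returns (secondary_index, primary_index), both 0-based.
    """
    if n_derived < 2:
        raise ValueError("derived word needs at least two syllables")
    primary = n_derived - 1        # the suffix itself takes primary stress
    # Derivational Principle: secondary stress goes where the base had
    # its primary stress. min() keeps the index inside the part of the
    # word that precedes the suffix.
    secondary = min(base_stress_index, primary - 1)
    # Stress Clash Avoidance: if that syllable is adjacent to the primary
    # stress and an earlier syllable exists, move one syllable leftwards.
    if secondary == primary - 1 and secondary > 0:
        secondary -= 1
    return secondary, primary
```

Here stress_bearing_suffix(3, 1) models Japan/Japanese and returns (0, 2), while stress_bearing_suffix(4, 1) models computer/computerese and returns (1, 3). For Burma/Burmese, stress_bearing_suffix(2, 0) returns (0, 1): there is no earlier syllable, so the clash cannot be avoided.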

We have argued that many words borrowed from French 
which have an -ette ending in the spelling, such as etiquette 
and gazette, are not really morphologically complex: it is hard 
to argue that they contain the morphemes etiq- and gaz-. 
However, we allowed that there are some -ette words which 
are clearly morphologically complex, as in the word 
kitchenette. Our grounds for doing so were that there is clearly 
a morpheme kitchen in a word like this, and a kitchenette is 
indeed a small kitchen. Given that we have allowed this, we 
can say that, at least in some words, there is an -ette suffix 
which takes primary stress, and is thus parallel to -ee, -eer and 
-ese. We saw that, where -ette has a clear meaning, it can 
mean either ‘little’ (as in kitchenette and sermonette) or 
‘female’ (as in the word ladette). There are some -ette words 
in which it is not entirely clear whether there is a proper -ette 
suffix or not. While it might seem reasonable to say that an 
usherette is a female usher, it is not clear that a maisonette is a 
small maison (the French word for ‘house’), although it’s 
certainly a small house. 

Examples of stress-shifting suffixes which do not themselves 
take primary stress are -ity and -ic, as in 'personal/ 
,perso'nality, ,indi'vidual/,individu'ality, 'atom/a'tomic, 
'monarch/mo'narchic. In each case, the primary stress falls on 
the syllable immediately preceding the stress-shifting suffix. 

Further examples of suffixes which shift stress are -ous (as in 
ad'vantage/,advan'tageous) and -ious (as in 'injure/ 
in'jurious). Again, the primary stress falls on the syllable 
immediately preceding the stress-shifting suffix. 
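
The generalization that primary stress falls on the syllable immediately preceding these suffixes reduces to a single subtraction. The Python sketch below is illustrative only: the function name stress_before_suffix and the table of syllable counts are assumptions of ours, and the syllable divisions are supplied by hand.

```python
# Syllables each suffix contributes to the derived word. These counts
# are an assumption of this sketch (e.g. -ity surfaces as "-i-ty").
SHIFTING_SUFFIX_SYLLABLES = {"ity": 2, "ic": 1, "ous": 1, "ious": 2}

def stress_before_suffix(syllables, suffix):
    """Return the 0-based index of the primary-stressed syllable,
    i.e. the syllable immediately preceding the suffix."""
    k = SHIFTING_SUFFIX_SYLLABLES[suffix]
    index = len(syllables) - k - 1
    if index < 0:
        raise ValueError("word too short for this suffix")
    return index
```

For example, stress_before_suffix(['per', 'so', 'na', 'li', 'ty'], 'ity') selects the third syllable, as in ,perso'nality, and stress_before_suffix(['a', 'to', 'mic'], 'ic') selects the second, as in a'tomic.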

You may already have noted that, when the stress in a word 
shifts as a result of the addition of a stress-shifting suffix, this 
can have the effect of changing the pronunciation of the vowel 
in the affected syllable. Thus, in personal, the final syllable, in 
the -al suffix, has a schwa vowel ([ə]) while, in personality, 
the antepenultimate syllable, again in the -al suffix, has an [æ] 
vowel, since that syllable bears primary stress in personality. 
Consonantal changes in the base form may also occur when 
suffixes are added. For instance, when -ity is added to the 
adjective opaque, the resulting form opacity has an [s], rather 
than a [k], at the end of the base. Such changes are not 
limited to stress-shifting suffixes. For instance, when the suffix 
-y is added to president, the resulting form, presidency, has an 
[s], rather than a [t], at the end of the base. Clearly, the more 
variation there is, in terms of both stress pattern and consonant 
and vowel realization, between base form and affixed form, the 
less evident it will be that the base and affixed forms are 
actually related: while the relationship between, say, bold and 
boldness (with no stress, vowel or consonant changes) is 
transparent, that between, say, opaque and opacity is much 
less so. It is for this reason that many people prefer to add 
affixes of the -ness sort (which tend to be native to the 
Germanic family of languages to which English belongs, unlike 
the -ity sort, which have their roots in the Latinate languages) 
to base forms. 

A striking property of unstressed syllables in English is that 
they often contain the schwa vowel, transcribed as [ə], which 
is ‘less distinct’ perceptually than most other vowels. It is a 
common feature of related words such as photograph and 
photography that, when the stress shifts from the first to the 
second syllable by virtue of the addition of the -y suffix, the 
vowel in the first syllable changes from being a ‘full’ vowel to 
being a ‘reduced’ schwa. We will look at this kind of 
phenomenon in a little more detail in the following chapter. 

8.5.3 Word Stress Patterns and 
Prefixes 

Let us now turn to the stressing of prefixes. We will take the 
view that most separable monosyllabic prefixes bear secondary 
stress. By ‘separable’, we mean that, if the prefix is removed, 
we are left with an existing English word, as in the verbs 
,re-'allocate, ,re-'fabricate, ,re-'run, ,re'skill, ,re'spray. Other 
monosyllabic prefixes include: 

co- (‘together’)             as in ,co-con'spirator, ,co-'edit 
de- (‘get rid of/reverse’)   as in ,de-regu'lation, ,de-'louse 
dis- (negative)              as in ,disa'ppear, ,dis'pleasure 
ex- (‘former’)               as in ,ex-ad'ministrator, ,ex-'boss, ,ex-'serviceman 
in- (negative)               as in ,inco'rrect, ,in'active (see too il-, as in 
                             ,ille'gality, ,i'lliterate, im-, as in ,imper'turbable, 
                             ,im'proper, and ir-, as in ,irre'sistible, ,i'rregular) 
mal- (‘badly’)               as in ,mala'djusted, ,ma'lodorous 
mis- (‘wrongly’)             as in ,mis-a'ddressed, ,mis-'spelled 
pre- (‘before’)              as in ,pre-e'xist, ,pre-'pay 
pro- (‘in favour of’)        as in ,pro-'hunting, ,pro-'choice, ,pro-'life 
re- (‘again’)                as in ,re-a'ppear, ,re-'fill (verb) 
sub- (‘beneath’)             as in ,sub-a'tomic, ,sub-'human 
trans- (‘across’)            as in ,trans-At'lantic, ,tran'sexual 
un- (negative)               as in ,una'ttractive, ,un'fair 


It is striking that some of these can be used as independent 
words, as in ‘I’m having dinner with my ex tonight’ and ‘Are 
you with the pros or the antis?’ 

Bisyllabic prefixes can form a trochaic foot, and so, in 
accordance with the Rhythmic Principle, will have the 
penultimate syllable of the prefix bearing secondary stress, as 
in ,anti-a'bortion, ,antibac'terial, ,anti-'Catholic, ,anti-'choice, 
,anticli'mactic, ,anti-in'flammatory, etc. While these all have 
secondary stress, there are some cases where there is primary 
stress on bisyllabic prefixes, as in 'anti-hero and 'antimatter. It 
is perhaps wise to consider these latter cases as compounds 
(words made from two or more words), to which we turn 
shortly. Equally, while mega- can have secondary stress, as in 
,mega'lithic, there are clear cases where it has primary stress, 
as in 'megabyte, 'megadeath and 'megaphone. These too may 
be used as independent words, as in ‘The antis are out in 
force’ or ‘That film was absolutely mega!’ Other bisyllabic 
prefixes are extra-, as in extra-marital, extra-terrestrial, and 
super-, as in super-abundant, super-human. 

Prefixes such as those just listed are of Latinate origin: they 
occur in words which have been borrowed directly from Latin, 
or from words borrowed from French, which is derived 
historically from Latin. As we have seen, there are a good 
many bisyllabic noun-verb pairs in which the verb is stressed 
on the final syllable, whereas the noun is stressed on the 
Latinate prefix. Examples are dis'charge (verb) vs 'discharge 
(noun), ex'port (verb) vs 'export (noun), re'search (verb) vs 
'research (noun). It is not clear that such words are still 
morphologically complex in contemporary English: for 
example, one cannot separate the ex- of export and arrive at a 
verb port. But these words certainly have elements which are, 
etymologically, prefixes. There are exceptions to the 
generalization that the verbs in such pairs are stressed on the 
final syllable, and the nouns on the etymological prefix: some 
pairs, such as debate, rebuke and supply, have the verbal 
pattern, while others, such as combat, invoice and preface, 
have the noun pattern. However, the differential verb-vs-noun 
stress pattern still seems to be productive in contemporary 
English. This can be seen from neologisms. Take the verb 
in'vite: it conforms to the pattern. The noun invitation has 
existed for some time, but a more informal term has been 
coined: an 'invite. Interestingly, when the new noun was 
derived from the existing verb, the stress shifted to conform 
with the noun pattern, suggesting that speakers have access to 
the noun-vs-verb stress patterns. 


8.6 Compound Words 

Compound words are, put simply, words which can be 
analysed as consisting of two (or more) words, rather than as 
containing a base and an affix. For instance, while mole-hill is 
a compound, boldness is not (the form -ness is a suffix, not a 
word). We will focus on two-part compounds here. The 
Compound Stress Rule in English says, of two-part 
compounds: of the two elements, the first is the most 
prominent. Two-word compounds thus have the opposite 
pattern to two-word phrases, such as the noun phrase black 
bird (a bird which is black), the adjective phrase very tall, verb 
phrases such as kissed Mary, adverb phrases such as very 
slowly and prepositional phrases such as into London. These 
all exhibit the English Phrasal Stress Rule (which we will return 
to in chapter 9): in all of these, it is the second element which 
is most prominent. Examples of compounds which have the 
regular compound stress pattern, with the first element the 
most prominent, are: atom-bomb, backdrop, blackbird, 
car-park, classroom, comeback, corkscrew, darkroom, dragonfly, 
filing cabinet, flower-bed, flowerpot, grammar school, 
handshake, high-school, make-up, place-name, social life, sex 
life, snowstorm, steamboat, textbook, woodpecker. 

How can we tell whether a given two-word sequence is a 
compound or a phrase? When the two parts are written as one 
word (e.g. flowerpot) or with a hyphen (e.g. atom-bomb), it is 
easy to see that one is dealing with a compound. But if the two 
parts are written as separate words (e.g. grammar school), it is 
less easy. (There is also variation in how compounds are 
written: one may find, for example, the written forms textbook, 
text-book and text book.) In many cases, compounds have a 
different kind of semantics (meaning) from phrases. Take the 
phrases black bird, dark room and green house. Compare their 
meanings with the compounds blackbird, darkroom and 
greenhouse. While all (male) blackbirds are black birds, not all 
black birds (phrase) are blackbirds (compound): ravens, 
jackdaws and cormorants are black birds (phrase), but they are 
not blackbirds (compound). While all darkrooms are rooms 
which are normally dark, not all dark rooms are darkrooms 
(places for developing photographs): if I close the shutters and 
switch off the lights in my study, it becomes a dark room 
(phrase), but not a darkroom (compound). A green house 
(phrase) is a house which is painted green, but a greenhouse 
(compound) is not a house, and may be painted white. It 
seems likely that, in the history of English, compounds started 
off as phrases: a (male) blackbird is indeed black, a darkroom 
is indeed normally dark, and a greenhouse is a house-shaped 
structure where one grows green things. But such phrases have 
made the transition to becoming single words. 

8.6.1 Exceptions to the Compound 
Stress Rule 

There are sets of compounds which systematically violate 
the Compound Stress Rule. We now list these. 

(a) two-part place-names, such as Botany Bay, 
Buckingham Palace, East Anglia, Los Angeles, Mount 
Everest, New York, Niagara Falls, San Francisco. These 
include street names, e.g. Blackberry Way, Fifth Avenue, 
London Road, Mornington Crescent, Mulholland Drive, 
Penny Lane, Peyton Place, Trafalgar Square. The only 
exceptions are the large numbers of street names ending 
in the word street: in my home town of Edinburgh, 
London Road (where Road is most prominent) is not far 
from London Street (where London is most prominent). 

(b) compounds with a participial second element, ending 
in -ed, -en or -ing. The -ed compounds are more common 
than the others. Many of these are based on parts of the 
body, and some of these are more metaphorical than 
others. Examples are: ,bare-'faced, ,big-'eared, 
,big-'headed, ,broken-'hearted, ,cack-'handed, ,dim-'witted, 
,double-'jointed, ,empty-'headed, ,even-'handed, 
,even-'tempered, ,faint-'hearted, ,fair-'haired, ,far-'sighted, 
,flat-'chested, ,flat-'footed, ,fleet-'footed, ,foul-'mouthed, 
,good-'natured, ,hard-'nosed, ,high-'pitched, 
,hot-'headed, ,ill-di'sposed, ,ill-'tempered, ,kind-'hearted, 
,left-'handed, ,level-'headed, ,lily-'livered, 
,limp-'wristed, ,long-'haired, ,long-'legged, ,long-'winded, 
,narrow-'shouldered, ,old-'fashioned, ,one-'eyed, 
,one-'legged, ,pig-'headed, ,red-'handed, ,red-'headed, 
,squeaky-'voiced, ,strong-'minded, ,strong-'willed, 
,thin-'lipped, ,weak-'kneed, ,weak-'willed, ,wrong-'footed, 
,wrong-'headed, ,clean-'shaven, ,long-'proven, 
,soft-'spoken, ,clear-'thinking, ,far-'reaching, ,good-'looking. 

(c) compounds in which the first part expresses what the 
object is made from. Examples are: apple pie, brick wall, 
cotton socks, ham sandwich, iron filings, paper napkin, 
pork pie, olive oil. Notice that these are distinct from 
similar compounds in which the first part does not express 
what the object is made from, as in paper clip, which is 
not made from paper, cotton reel, which is not made from 
cotton, and olive tree, which is not made of olives. Note 
that, in American English, the first element can be the 
most prominent in such compounds. 

(d) compounds in which some kind of concrete or abstract 
positioning is involved. Examples are: April showers, 
Christmas break, evening meal, middle-class, second- 
rate, winter holiday. 

(e) compounds which are two-part colour words. 
Examples are: dark-green, deep-yellow, light-green, pale- 
blue. 

(f) compounds derived from phrasal verbs. Examples are 
compound nouns derived from phrasal verbs: chucker-out 
(from chuck out), hanger-on (from hang on), passer-by 
(from pass by), washing-up (from wash up); and 
compound adjectives derived from phrasal verbs: 
finished-off (from finish off), knocked-out, pared-down (from pare 
down), rolled-up (from roll up), tired-out, wiped-out, 
wrapped-up (from wrap up). 


8.7 Summing Up 

We have covered a fair amount of detail in this chapter. It may 
prove helpful to the reader to have the main points summarized 
here, ignoring many details and exceptions: if the reader can 
grasp the take-home message for each of the points covered in 
this chapter, the details can then be mastered by consulting 
each section of the chapter. 

• English word stress is not random. 

• English rhythm is trochaic, as in woman and battery. 

• Primary stress is calculated from the end of the word, 
not the beginning. 

• English words cannot begin with more than one 
unstressed syllable. 

• When one English word is derived from another, and the 
primary stress shifts as a result of the derivation, there is a 
tendency to place the secondary stress on the syllable that 
had primary stress in the deriving word, as in 
ˌcharacteriˈzation. 

• There is a tendency to avoid placing primary and 
secondary stresses next to each other, as in Japanese. 

• While English nouns, adjectives and adverbs mainly 
follow the basic trochaic pattern, there are many verbs 
which do not. 

• English suffixes may be divided into those which affect 
the primary stress (such as -ese and -ity) and those which 
do not (such as -ness). 

• Among the suffixes which affect primary stress, some 
take the primary stress (such as -ese), while others do not 
(such as -ity). 

• Separable prefixes normally take secondary stress, as in 
ˌpre-ˈpay. 

• The basic pattern for two-part compounds is: the first 
element is the most prominent, as in darkroom. 


Notes 

1 To the extent that any such endings may be productive, and 
thus may result in neologisms, then the resulting words will 
indeed be morphologically complex. For instance, the fairly 
recently coined word ladette (in Britain, a female lad: a 
young woman who behaves in a loud, foul-mouthed, 
heavy-drinking, sexually promiscuous manner) is morphologically 
complex (here -ette means ‘female’, rather than ‘little’). 

2 It might be argued that words such as cosmic are 
morphologically complex, given the existence of the word 
cosmos. We take the view that cosm- does not constitute a 
morpheme in contemporary English, and that words such as 
cosmos and cosmic are therefore morphologically simple. 

3 There is, however, a case for saying that words such as 
kitchenette are morphologically complex, since kitchen is 
clearly a morpheme, and a kitchenette is indeed a small 
kitchen. 

4 Note that adult is typically stressed on the final syllable in 
American English. 

5 The word opacity also exhibits an alternation, between [eɪ] 
and [æ], in the stressed vowel of the base. This is one of a 
set of vowel alternations which we do not examine here. 

6 The expression bouquet garni (a bunch or sachet of herbs 
used in cooking) is pronounced with primary stress on the 
penultimate, rather than the final, syllable of bouquet. This is 
a result of the process of iambic reversal, described later in 
our discussion of metrical structure. 


Exercises 

Listen to sound 
files online 

1 Listen to Track 8.1 at www.wiley.com/go/carrphonetics. 
For each of the bisyllabic words on the recording, say 
which ones have the default trochaic stress pattern for 
primary stress. For those which deviate from that pattern, 
explain why. The words are: 

(a) famine 

(b) Maltese 

(c) migrate 

(d) trainee 

(e) winter 

(f) explain 

(g) silly 

(h) compact (noun) 

(i) compact (verb/adj.) 

(j) export (noun) 

(k) export (verb) 

(l) stumble 

(m) fancy 

(n) differ 

(o) taboo 

(p) gazette 

(q) arcade 

(r) burlesque 

2 Listen to Track 8.2 . For each of the sets of polysyllabic 
words on the recording, say which have the default 
Germanic trochaic pattern. For those which deviate from 
the default pattern, explain why. The sets are as follows: 

(a) factory 
America 
family 
academy 
stimulus 

(b) develop 
inherit 
complicit 
explicit 
inherit 

(c) kangaroo 
employee 
engineer 
seventeen 
mountaineer 

(d) interrogate 
investigate 
accommodate 
demonstrate 
co-ordinate 

(e) mathematics 
physics 
periodic 
linguistics 
alcoholic 

(f) autumnal 
sentimental 
orchestral 
horizontal 
universal 

(g) hostility 
austerity 
modernity 
humility 
ambiguity 

(h) banana 
bikini 
karate 
martini 
piano 

(i) adventure 
amalgam 
consensus 
November 
advantage 

(j) momentary 
secretary 
literary 
laboratory 
military 

3 Listen to Track 8.3 . Each of the nouns on the recording 
(all of them loanwords into English) has a stress pattern 
which deviates from the default Germanic pattern. In each 
case, say in what sense that stress pattern is exceptional 
with respect to the rules of word stress assignment in 
English. Primary and secondary stresses are marked as 
follows: [ˌhəʊˈtɛl] (hotel, with primary stress on the final 
syllable and secondary stress on the penult). Where the 
GA and RP pronunciations differ, this is indicated, as in 
[ˌhəʊˈtɛl]/[ˌhoʊˈtɛl], which gives the RP pronunciation 
followed by the GA pronunciation. 

(a) hotel ([ˌhəʊˈtɛl]/[ˌhoʊˈtɛl], not [ˈhəʊtɛl]/[ˈhoʊtɛl]) 

(b) bouquet ([ˌbuˈkʰeɪ], not [ˈbuːkeɪ])⁶ 

(c) bamboo ([ˌbæmˈbuː], not [ˈbæmbuː]) 

(d) champagne ([ˌʃæmˈpʰeɪn], not [ˈʃæmpeɪn]) 

(e) bikini ([bɪˈkʰiːni], not [ˈbɪkʰɪni]) 

(f) martini ([mɑːˈtʰiːni]/[mɑɹˈtʰiːni], not [ˈmɑːtɪni]/[ˈmɑɹtɪni]) 

(g) chorizo ([tʃəˈɹiːzoʊ], not [ˈtʃɒɹɪzoʊ]) 


4 For non-native speakers: listen to Track 8.4 and repeat 
each utterance. For native speakers and non-native 
speakers alike: indicate primary stresses and secondary 
stresses on the transcriptions of those utterances, given 
below. For example: 

Mary finds Bill’s book uninterpretable. ˈmɛəɹi ˈfaɪndz 
ˈbɪlz ˈbʊk ˌʌnɪnˈtɜːpɹətəbl 

(a) Mathematics is incredibly difficult. mæθəmætɪks ɪz 
ɪŋkɹɛdɪbli dɪfɪkəlt 

(b) My car was made in America. maɪ kɑː wəz meɪd ɪn 
əmɛɹɪkə 

(c) His computer is Japanese. hɪz kəmpjuːtəɹ ɪz dʒæpəniːz 

(d) Academic conversation is dull. ækədɛmɪk kɒnvəseɪʃn 
ɪz dʌl 

(e) The police will interrogate the detainees. ðə pəliːs wɪl 
ɪntɛɹəgeɪt ðə dɪteɪniːz 

(f) They don’t produce many exports. ðeɪ dəʊnt pɹədjuːs 
mɛni ɛkspɔːts 

(g) They don’t export much produce. ðeɪ dəʊnt ɪkspɔːt 
mʌtʃ pɹɒdjuːs 

(h) I found the film rather sentimental. aɪ faʊnd ðə fɪlm 
ɹɑːðə sɛntɪmɛntl 


5 For non-native speakers: listen to Track 8.5 and repeat 
the words you hear. For native speakers and non-native 
speakers alike: explain the word stress patterns of the 
following groups of English bisyllabic words, as heard on 
the recording. 

(a) happen 
woman 
fancy 
echo 
father 

(b) deny 
inspect 
comply 
expand 
inflect 

(c) trainee 
bamboo 
bazaar 
taboo 
shampoo 

(d) create 
migrate 
locate 
frustrate 
narrate 

(e) produce (verb) 
export (verb) 
discharge (verb) 
object (verb) 
contract (verb) 

(f) produce (noun) 
export (noun) 
discharge (noun) 
object (noun) 
contract (noun) 



6 Listen to Track 8.6 . Explain where the metrical foot 
boundaries fall in the following sentences, as heard on the 
recording. (Draw vertical lines where the foot boundaries 
fall.) 

(a) Leave me alone! 

(b) Leave me a slice! 

(c) She left in a hurry. 

(d) She lives in America. 

(e) Put it in the refrigerator! 

(f) John’s a modern metrosexual. 

(g) Clinton opposes militaristic solutions. 

(h) Aude is a flexitarian. 

Rhythm, Reversal and 
Reduction 

9.1 More on the Trochaic 
Metrical Foot 

We said, in chapter 8, that the rhythm of English is trochaic: 
the basic rhythmic pattern consists of a stressed syllable 
followed by zero or more unstressed syllables. For instance, in 
the phrase made in a factory, the metrical structure is 
[ˈmeɪdɪnəˈfæktəɹi]. The two trochaic feet here are [ˈmeɪdɪnə] 
and [ˈfæktəɹi]. We assumed too that syllables with secondary 
stress also form trochaic metrical feet, as in the word 
academic, [ˌækəˈdɛmɪk]. The two trochaic metrical feet here 
are [ˌækə] and [ˈdɛmɪk]: the secondary stress in [ˌækə] forms 
a trochaic metrical foot with the following unstressed syllable, 
and the primary stress in [ˈdɛmɪk] forms a trochaic metrical 
foot with the following unstressed syllable. 

But what is the evidence for the metrical foot? And what 
evidence is there for our claim that all feet in English are 
trochaic? We will now address these questions. 

9.1.1 Evidence for the Trochaic 
Metrical Foot (a): Rhyming 

Although we have identified a constituent within the syllable 
widely known as the rhyme, the term is a misnomer: this 
constituent is not the unit on which rhyming in English is 
based. While it is true that bad rhymes with mad, and that both 
contain the rhyme [æd], we must not be misled into thinking 
that two words rhyme only if they have identical rhyme 
constituents in the syllable, in this case [æd]. Consider the 
words witty and city: they rhyme because they both have a 
trochaic metrical foot of the same sort: [ˈwɪti] and [ˈsɪti]. 
Clearly, onset consonants play no role in rhyming, but metrical 
structure does, and the rhyme constituent does not. The reason 
why entity does not rhyme with either witty or city is that the 
metrical foot structure of entity is [ˈɛntɪti]: the word entity does 
not contain a metrical foot of the shape [ˈɪti]. Rhyming in 
English is based on identity of stressed vowels in two or more 
trochaic metrical feet, and identity of all subsequent phonetic 
segments.¹ The word entity does not have the same stressed 
vowel as the words witty and city. 
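
The rhyming generalization just stated can be made concrete with a small computational sketch. The Python below is an illustration only, not part of the analysis itself. It assumes that each word or phrase has already been divided into syllables, that each syllable is given as a list of segments together with a flag saying whether it is stressed, and that the last stressed syllable is the one which opens the rhyming foot. The names VOWELS, rhyme_domain and rhymes are invented for the example, and the vowel set covers only the symbols needed here.

```python
# Vowel symbols used in the examples (diphthongs count as single segments).
VOWELS = {"ɪ", "i", "ɛ", "æ", "ə", "ʌ", "ɒ", "u", "uː", "əʊ", "oʊ", "eɪ"}


def rhyme_domain(syllables):
    """Return the stressed vowel of the last foot plus every later segment.

    `syllables` is a list of (segments, stressed) pairs. Onset consonants
    of the stressed syllable are discarded, since they play no role.
    Assumes the stressed syllable contains a symbol listed in VOWELS.
    """
    stressed = [i for i, (_, is_stressed) in enumerate(syllables) if is_stressed]
    if not stressed:
        return None  # no stressed syllable, so no foot to rhyme on
    start = stressed[-1]
    first = syllables[start][0]
    vowel_at = next(i for i, seg in enumerate(first) if seg in VOWELS)
    domain = list(first[vowel_at:])
    for segments, _ in syllables[start + 1:]:
        domain.extend(segments)
    return tuple(domain)


def rhymes(a, b):
    """Two forms rhyme if their rhyme domains are identical."""
    da, db = rhyme_domain(a), rhyme_domain(b)
    return da is not None and da == db


WITTY = [(["w", "ɪ"], True), (["t", "i"], False)]
CITY = [(["s", "ɪ"], True), (["t", "i"], False)]
ENTITY = [(["ɛ", "n"], True), (["t", "ɪ"], False), (["t", "i"], False)]
PHONE_YA = [(["f", "əʊ", "n"], True), (["j", "ə"], False)]
PNEUMONIA = [(["n", "j", "u"], False), (["m", "əʊ"], True), (["n", "j", "ə"], False)]

print(rhymes(WITTY, CITY))
print(rhymes(WITTY, ENTITY))
print(rhymes(PHONE_YA, PNEUMONIA))
```

On this sketch witty and city share the domain [ɪti], whereas the domain of entity begins with a different stressed vowel, so entity rhymes with neither.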

Similarly, the phrase phone ya (phone you), which is 
pronounced [ˈfəʊnjə] in RP and [ˈfoʊnjə] in GA, rhymes with 
pneumonia: [njuˈməʊnjə] in RP and [nuˈmoʊnjə] in GA. In 
this case, the rhyme is based on trochaic metrical feet which do 
not map directly onto word boundaries: [ˈfəʊnjə] contains two 
words, while [ˈməʊnjə] is part of a word. 

9.1.2 Evidence for the Trochaic 
Metrical Foot (b): Expletive 
Insertion 

Expletives such as bloody and fucking are frequently used in 
informal spoken English by many speakers. Each consists of a 
standard Germanic bisyllabic trochaic metrical foot: [ˈblʌdi] 
and [ˈfʌkɪŋ] (also pronounced with a final [n] by many 
speakers of English, which may be syllabic: [ˈfʌkn̩]). 

The patterns of use of words such as these in syntactic 
structure are, contrary to popular belief, quite complex. We will 
ignore that syntactic complexity and focus here on the fact that 
they can be inserted into the internal structure of words, as in 
abso-bloody-lutely, where bloody is inserted into the word 
absolutely. This word is an adverb derived from the adjective 
absolute, which has the basic antepenultimate word stress 
pattern of polysyllabic words in English: [ˈæbsəluːt]. There is, 
however, an emphatic pronunciation with final primary stress 
and a secondary stress on the antepenult: [ˌæbsəˈluːt]. It is 
into this emphatic form that the expletive can be inserted, as in: 

A: Do you like Amy Winehouse? 

B: Abso-bloody-lutely! 

If you are a native speaker of English (and some non-native 
speakers will see this too), you will agree that it is not possible 
to reply ab-bloody-solutely, or absolute-bloody-ly. But why 
not? The answer lies in the existence of the trochaic metrical 
foot. The expletives are bisyllabic trochaic metrical feet. The 
words into which they can be inserted will contain trochaic 
metrical feet. In inserting expletives, one must respect the 
trochaic metrical structure of the word one is inserting an 
expletive into: abso-bloody-lutely contains a sequence of three 
trochaic bisyllabic metrical feet. It respects the trochaic 
bisyllabic metrical foot structure of absolute, inserting another 
such foot in between those two. 
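
The restriction on expletive insertion can be sketched in a few lines of Python. This is a hedged illustration rather than a full account: it assumes that the host word has already been divided into its metrical feet, that each foot is supplied as an orthographic chunk, and that the expletive may be placed only at a boundary between two feet. The function name insert_expletive and the chunked spellings are invented for the example.

```python
def insert_expletive(feet, expletive="bloody"):
    """Return every form in which the expletive sits at a foot boundary.

    `feet` is a list of orthographic chunks, one per metrical foot.
    The expletive is never placed inside a foot, and a word consisting
    of a single foot offers no insertion site at all.
    """
    return [
        "-".join(feet[:i] + [expletive] + feet[i:])
        for i in range(1, len(feet))
    ]


# Emphatic 'absolutely' parsed as two feet: abso + lutely.
print(insert_expletive(["abso", "lutely"]))
# A single-foot word such as 'witty' has no internal foot boundary.
print(insert_expletive(["witty"]))
```

Because ab-bloody-solutely and absolute-bloody-ly would both split a foot, neither can be produced by a procedure that inserts only at foot boundaries.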

9.1.3 Evidence for the Trochaic 
Metrical Foot (c): Neologisms 

Many neologisms are based on the trochaic metrical foot. Here 
are some examples: alcoholic, workaholic, shopaholic, 
sexaholic, chocoholic. All of these neologisms are based on 
analogy with one or more previously existing words. If we 
assume that workaholic, shopaholic, sexaholic, chocoholic 
are all based on an analogy with the word alcoholic, then we 
must ask what the basis of the analogy might be. In this case, it 
cannot be the morphological structure of the word alcoholic, 
which contains the root alcohol and the suffix -ic (which, as 
we have seen, shifts the stress from the word alcohol, 
pronounced [ˈælkəhɒl]). If the morphology were the basis for 
the analogy, then the neologisms would be workic, shopic, 
sexic and chocic. 

The analogy in these cases is based on the metrical structure 
of the word alcoholic ([ˌælkəˈhɒlɪk]), which contains two 
bisyllabic trochaic metrical feet: [ˌælkə] and [ˈhɒlɪk]. The 
reason why the forms work, shop, sex and choc have to be 
written with an <a> is that this letter represents a schwa vowel 
([ə]) which is present in the first trochaic metrical foot of 
alcoholic, on which the analogy is based. Native speakers of 
English produce such neologisms because they have a (perhaps 
not entirely conscious) sense of the metrical structure of their 
native language. 

Consider contemporary neologisms such as flexitarian (a 
vegetarian who is prepared to be flexible, and eat meat from 
time to time) and pescitarian (someone who doesn’t eat meat, 
but who does eat fish, or seafood in general). These words 
have been formed by analogy with the word vegetarian, 
pronounced [ˌvɛdʒəˈtɛəɹiən] in RP. The two trochaic metrical 
feet in this word are [ˌvɛdʒə] and [ˈtɛəɹiən]. It is this metrical 
structure that drives the analogical process which results in 
flexitarian and pescitarian: [ˌflɛksɪˈtɛəɹiən] and 
[ˌpɛskɪˈtɛəɹiən].² 

Consider too the recent neologism metrosexual (a 
metropolitan heterosexual man who is overly concerned with 
his physical appearance). In this case, the trochaic metrical foot 
plays a role: the word heterosexual is pronounced 
[ˌhɛtɹəʊˈsɛkʃəl]. The rhyming process, based on the trochaic metrical 
foot, is again at work here: speakers know that the hetero 
morpheme may be bisyllabic ([ˌhɛtɹəʊ]), as is the morpheme 
sexual ([ˈsɛkʃəl]).³ 

The recent brand name safetergent (a cleaning product) is 
based on the word detergent: [dɪˈtʰɜːdʒənt] (RP), [dɪˈtʰɜɹdʒənt] 
(GA). The foot structure of this word has an initial 
extrametrical unstressed syllable, leaving [ˈtʰɜːdʒənt] (RP)/ 
[ˈtʰɜɹdʒənt] (GA) as the trochaic metrical foot. We can then 
add the morpheme safe ([seɪf]) to this metrical foot to produce 
safetergent: [ˌseɪfˈtʰɜːdʒənt] (RP)/[ˌseɪfˈtʰɜɹdʒənt] (GA). 

Similarly, the words kissagram ([ˈkɪsəgɹæm]: someone who 
delivers a telegram-type message with a kiss) and stripagram 
([ˈstɹɪpəgɹæm]: someone who delivers such a message and 
strips) are formed by analogy with the word telegram, whose 
metrical structure conforms to the basic trochaic structure for 
words of more than two syllables in English in that it has 
antepenultimate primary stress: [ˈtɛləgɹæm]. Although the 
morphological structure of telegram is tele + gram, the 
neologisms are not kissgram and stripgram. The reason for this 
is that the neologisms are modelled on the metrical structure of 
telegram. The English trochaic metrical foot clearly plays a role 
in all of the neologisms we have considered here. 

9.2 Representing Metrical 
Structure 

We have represented primary stress with a superscript diacritic, 
as in [ˈdɛmɪk], and secondary stress with a subscript diacritic, 
as in [ˌækə] in the word academic. These conventions will 
suffice if we confine our interest to the level of the word. But 
they will not suffice if we wish to represent the way levels of 
stress and relative perceptual salience operate when words are 
combined into phrases. Take the phrase kangaroo court, for 
instance. When kangaroo appears in the phrase kangaroo 
court, the secondary and primary stresses in that word switch 
round. The single syllable of the word court has more stress 
than any of the stressed syllables in kangaroo, but in the word 
kangaroo, the [kʰæŋgə] foot is less salient than the [ɹuː] foot. 
So we are dealing here with three different levels of salience. 
That is not easy to represent using only the two diacritics we 
have used for word stress. We need a further mode of 
representation. 

We represented syllable structure in terms of branching tree 
structures. Many phonologists also represent foot structure in 
terms of branching trees. We will represent any syllable which 
has any degree of stress with an ‘S’, indicating that it is strong 
with respect to weak unstressed syllables, which we label with 
a ‘W’. A stressed syllable and any unstressed syllables with 
which it forms a foot may then be represented as follows: 

(1) 

    /\             / | \
   S  W           S  W  W
   wɪ ti          sɪ nə mə
  (witty)         (cinema)

The bottom-most level of representation in this diagram is 
the level of the segment. The next level up is the syllable. At 
that level, the S labels represent strong (stressed) syllables and 
the W labels represent unstressed syllables. It is important to 
bear in mind that stress levels are relational: rather than a 
stressed syllable being definable in absolute terms, one syllable 
is more or less stressed in relation to another. 

The next level of representation up from the syllable (the 
lines above the S and W labels in this diagram) is that of the 
foot. Each word in (1) consists of a single foot, the first word 
consisting of a binary-branching foot and the second word 
consisting of a ternary-branching (three-way-branching) foot. 

Monosyllabic lexical words contain, by definition, a single 
stressed syllable. We will take it that they contain a single, 
non-branching foot. We will therefore represent such words as 
having a single S-labelled syllable (indicating that it is stressed), 
dominated by a non-branching foot node, thus: 

(2) 

    |
    S
   hɪt
  (hit)

This diagram represents two claims. The first is that the 
syllable in question is stressed (labelled S). The second is that, 
because it is stressed, it constitutes a foot which happens not to 
have a branching structure, since there are no unstressed 
syllables following it. 

Monosyllabic non-lexical function words, such as pronouns 
(e.g. he, she, me, it), prepositions (e.g. in, on, at), articles (a, 
the) and conjunctions (e.g. and, but, if) are typically 
unstressed. We will therefore represent them with a W-labelled 
syllable, but no foot structure above that level (since a foot by 
definition must contain a stressed syllable), thus: 

(3) 

    W
    ɪt
   (it)

The relational nature of stress levels can be seen clearly in 
words which have both a primary stressed syllable and a 
secondary stressed syllable, such as woodpecker. It is clear that 
the antepenultimate syllable in this word has more stress than 
the penultimate (it is strong with respect to the penult). It is 
equally clear that the final syllable has less stress than the 
penultimate: it is unstressed, and thus weak. We may therefore 
represent the foot structure of the word as consisting of two 
feet, the first of which is stronger than the second, as follows:⁴ 

(4) 

    S      W
    |     / \
    S    S   W
   wʊd   pɛ  kə
   (woodpecker)

Note that the S/W notation is used to represent the relative 
strength both of syllables within a foot and of sequences of 
feet: the notation shows that the first syllable is strong with 
respect to the second, and also shows, at a higher level, that 
the first foot is strong with respect to the second. 

In words such as colonnade and kangaroo, on the other 
hand, it is the second of the two feet which is the stronger, 
since such words have a secondary stressed syllable followed by an 
unstressed syllable followed by a primary stressed syllable: 

(5) 

     W       S
    / \      |
   S   W     S
   kɒ  lə   neɪd
   (colonnade)

Words such as champagne, which have a primary stressed 
and a secondary stressed syllable, but no unstressed syllables, 
contain two feet, each of which contains only a strong syllable. 
However, one of those feet is strong with respect to the other, 
thus: 

(6) 

    W     S
    |     |
    S     S
   ʃæm   peɪn
  (champagne)

In this word, the second of the two feet is the stronger, 
whereas, in a word such as the noun export, it is the first of the 
two feet which is the stronger: 

(7) 

    S     W
    |     |
    S     S
   ɛks   pɔːt
   (export)

Recall that monosyllabic function words are typically 
unstressed (and are thus simply labelled with a W). 
Monosyllabic words of a lexical category may form branching 
feet with such words, as in the phrase hit it. Thus a phrase 
such as hit it contains exactly the same foot structure as a 
single word such as witty : 

(8) 

    /\             /\
   S  W           S  W
  hɪt ɪt          wɪ ti
  (hit it)       (witty)

Thus, the constituent we have called the foot does not map 
directly onto the word: there may be more than one foot within 
a word, and a foot may extend beyond the span of a single 
word. Furthermore, a word may not be exhaustively divisible 
into feet. For example, words such as America contain a foot 
consisting of the stressed antepenultimate syllable and the two 
unstressed syllables which follow it; the word-initial, unstressed 
syllable is a ‘stray’ unstressed syllable, which is part of the 
word, but is not integrated into the foot structure formed by 
the three syllables which follow it (rather in the same way that 
an /s/ preceding an onset consonant may be part of a word 
without being integrated into syllable structure): 

(9) 

         / | \
    W   S  W  W
    ə   mɛ ɹɪ kə
      (America)



The word-initial, ‘stray’ (extrametrical) unstressed syllable 
here is parallel to the monosyllabic function words discussed 
above: at the level of the word, it is not integrated into foot 
structure. On this way of analysing English foot structure, 
words which begin with an unstressed syllable, such as 
America, about and maroon, do not consist of a single foot 
which begins with a W syllable, since we are denying that there 
are W-S feet in English. It is only at the level of larger units 
such as the phrase that such unstressed syllables may be 
integrated into foot structure,⁵ as in the verb phrase saw 
America: 

(10) 

    /\         / | \
   S  W       S  W  W
  sɔː ə       mɛ ɹɪ kə
     (saw America)

We saw that words like champagne and export are bisyllabic 
and contain two feet. These are fundamentally different from 
words such as maroon, which are also bisyllabic but begin 
with a ‘stray’ unstressed syllable and contain only one foot, 
which consists of the stressed syllable: 

(11) 

         |
    W    S
    mə  ɹuːn
    (maroon)
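
The foot structures in (1) to (11) all follow from one procedure: each stressed syllable opens a foot, each following unstressed syllable joins that foot, and any unstressed syllable that precedes the first stress is left stray. The Python below is a minimal sketch of that procedure, offered as an illustration under stated assumptions. It assumes that syllables arrive already divided and marked as stressed or unstressed, and it does not compute the relative strength of feet. The function name parse_feet is invented for the example.

```python
def parse_feet(syllables):
    """Group syllables into trochaic feet.

    `syllables` is a list of (text, stressed) pairs. Returns a pair
    (stray, feet): `stray` holds unstressed syllables that precede the
    first stress, and `feet` is a list of feet, each a list of syllables
    beginning with a stressed one.
    """
    stray, feet = [], []
    for text, stressed in syllables:
        if stressed:
            feet.append([text])      # a stressed syllable opens a new foot
        elif feet:
            feet[-1].append(text)    # an unstressed syllable joins the current foot
        else:
            stray.append(text)       # no foot yet: the syllable is stray
    return stray, feet


AMERICA = [("ə", False), ("mɛ", True), ("ɹɪ", False), ("kə", False)]
SAW_AMERICA = [("sɔː", True)] + AMERICA
MAROON = [("mə", False), ("ɹuːn", True)]
CHAMPAGNE = [("ʃæm", True), ("peɪn", True)]

for form in (AMERICA, SAW_AMERICA, MAROON, CHAMPAGNE):
    print(parse_feet(form))
```

Applied to the word America alone, the procedure leaves the first syllable stray; applied to the phrase saw America, the same syllable is absorbed into the foot opened by saw.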

9.3 Phonological 
Generalizations and Foot 
Structure 

One of the reasons for postulating the foot as a phonological 
constituent is that, just as some phonological generalizations 
are sensitive to syllable structure, so some phonological 
generalizations are sensitive to foot structure. Take the 
generalization, or rule, of Flapping, in many dialects of 
American English. Under this generalization, /t/ and /d/ are 
realized as an alveolar tap (also known as a flap) between 
vowels, as in Betty and bedding ([bɛɾi] and [bɛɾɪŋ]). But the 
rule does not apply if a foot boundary occurs adjacent to the /t/ 
or /d/. Thus, the generalization does not cover cases such as 
attacker, or a tacker, since, in those cases, a foot boundary 
intervenes between the first vowel and the /t/, thus: 

(12) 

         /\
    W   S  W
    ə   tæ kə
 (attacker, a tacker)

If a foot consists, as we have said, in a stressed vowel 
followed by any immediately following unstressed syllables, 
then a word such as attacker contains a single foot (which 
begins with the stressed syllable) preceded by a ‘stray’ 
unstressed syllable, as in the word-initial syllable of America. 
Thus the word-initial syllable is not a part of the foot in which 
the /t/ appears, whereas in a word such as Betty, it is: 

(13) 

    /\
   S  W
   bɛ ɾi
  (Betty)

Note that Flapping also occurs in feet which are formed 
across word boundaries, as in hit it ([hɪɾɪt]): 

(14) 

    /\
   S  W
   hɪ ɾɪt
  (hit it)

That is, Flapping is not sensitive to word boundaries or to 
the morphological structure; rather, it is foot structure which 
matters in the application of Flapping: Flapping only applies 
foot-internally. 
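
The foot-internal character of Flapping can be illustrated with a short Python sketch. It is a simplification made for the purposes of illustration: it assumes that the form has already been parsed into stray syllables and feet, that each syllable is a list of segments, and that the only context considered is the one given in the text, namely /t/ or /d/ between two vowels inside a single foot. The names VOWELS, flap_foot and flap are invented for the example.

```python
# Vowel symbols needed for the examples below.
VOWELS = {"ɪ", "i", "ɛ", "æ", "ə"}


def flap_foot(foot):
    """Apply Flapping inside one foot (a list of syllables, each a list of segments)."""
    segments = [seg for syllable in foot for seg in syllable]
    output = list(segments)
    # Only positions with a neighbour on both sides inside the foot qualify,
    # so a foot-initial or foot-final /t/ or /d/ is never flapped.
    for i in range(1, len(segments) - 1):
        if (segments[i] in ("t", "d")
                and segments[i - 1] in VOWELS
                and segments[i + 1] in VOWELS):
            output[i] = "ɾ"
    return "".join(output)


def flap(stray, feet):
    """Flap each foot separately; stray syllables are copied unchanged."""
    return "".join(stray) + "".join(flap_foot(foot) for foot in feet)


print(flap([], [[["b", "ɛ"], ["t", "i"]]]))            # Betty
print(flap([], [[["h", "ɪ", "t"], ["ɪ", "t"]]]))       # hit it
print(flap(["ə"], [[["t", "æ"], ["k", "ə"]]]))         # attacker
```

Since each foot is processed on its own, the /t/ of attacker is foot-initial and is left unchanged, while the first /t/ of hit it is foot-internal and is flapped even though a word boundary follows it.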

Another example of a generalization which is often said to be 
sensitive to foot structure is the rule of Aspiration. We have 
already noted that we must acknowledge that there are degrees 
of aspiration of voiceless stops in English. However, aspiration 
is at its strongest when the voiceless stop in question is in foot- 
initial position, as in party and appearance, whose foot 
structures are given in (15): 

(15) 

    /\               /\
   S  W         W   S  W
  pɑː ti        ə  pɪə ɹəns
  (party)       (appearance)

Aspiration applies foot-initially (although there may be some 
degree of aspiration in other positions). 














9.4 The Rhythm of English 
Again: Stress Timing and 
Eurhythmy 

We saw, in chapter 8, that the rhythm of English is stress- 
timed. What this means is that the regular recurring beats 
found in the speech of English speakers (the rhythm of English 
speech) fall on stressed syllables. That is, stressed syllables in 
English occur at more or less equal intervals. Languages like 
English are often said to be distinct from languages like French 
in this respect in that, in languages like French, each syllable is 
said to occur at a more or less equal interval (languages of that 
sort are often, therefore, said to be syllable-timed). 

One of the consequences of this kind of rhythm is that 
English feet may consist of a stressed syllable followed by a 
sequence of unstressed syllables, as in the phrase heard in the 
park, in which the stressed syllable in heard is followed by two 
unstressed syllables, or the phrase heard it in the park, where 
heard is followed by three, or the phrase heard it in the 
announcement, where it is followed by four. 

Having said that English allows for really quite extensive 
sequences of unstressed syllables, it has to be said that the 
‘ideal’ or optimal rhythmic structure is one in which strong and 
weak syllables alternate, in an S-W-S-W pattern. It appears to 
be the case that such sequences of ‘alternating opposites’ are 
optimal in a perceptual sense: they seem to make the speech 
signal more easily decoded. Such optimal rhythmic structures 
are often referred to as eurhythmic structures. It follows from 
this that the optimal, most eurhythmic, foot structure is a 
simple S-W structure, with only one unstressed syllable to the 
right of the stressed syllable. Foot structures with more than 
one W syllable are therefore less eurhythmic, less optimal, than 
those with only one, and the greater the number of unstressed 
syllables, the less eurhythmic or optimal the foot. 

This preference for eurhythmy extends to sequences of feet: 
sequences of S and W feet are also more eurhythmic than 
other sequences. For instance, in the sentence I want a cup of 
coffee, there is an S-W-S sequence of three feet in the verb 
phrase, each of which is itself an S-W sequence of syllables; it 
is eurhythmic both at the level of sequences of syllables and at 
the level of sequences of feet: 

(16) 

             S        W        S
            / \      / \      / \
    W      S   W    S   W    S   W
    aɪ    wɒn  tə   kʌ  pə   kɒ  fi
         (I want a cup of coffee)

In many cases, however, a given combination of words may 
potentially create a phrase which is less than eurhythmic, and 
indeed may potentially result in adjacent S-labelled feet. This 
results from the fact that, in most English phrases, it is the final 
word which is most stressed, as in the phrase black bird, 
discussed earlier. This Phrasal Stress Rule seems to hold for 
most types of phrase in English, as in slowly ate (verb phrase), 
very yellow (adjective phrase), into London (prepositional 
phrase) and very slowly (adverb phrase). It also seems to apply 
at the level of the sentence, as we can see from the example 
just given: the predicate verb phrase is more salient than the 
preceding subject noun phrase. Where the Phrasal Stress Rule 
brings about adjacent S-labelled feet, it appears that ‘evasive 
action’ can be taken. Let us consider some examples. 

Take the words academic, Tennessee and champagne. 
Clearly, academic has primary stress on the penultimate 
syllable and secondary stress on the first syllable; the other 
syllables are unstressed. The foot structure of the word is as 
follows: 

(17) 

     W         S
    / \       / \
   S   W     S   W
   æ   kə    dɛ  mɪk
     (academic)


Tennessee also contains two feet, the second stronger than 
the first. However, the second foot consists simply of a 
stressed syllable, with no unstressed syllables following it: 

(18) 

     W        S
    / \       |
   S   W      S
   tɛ  nə    siː
    (Tennessee)

Champagne also has two feet, as we have already seen, the 
first of which consists of a syllable with secondary stress and 
the second of which consists of a syllable with primary stress: 

(19) 

    W     S
    |     |
    S     S
   ʃæm   peɪn
  (champagne)

In each of these three cases, the word consists of two feet, 
the second of which is strong with respect to the first. 
However, when these words appear in phrases where the 
stronger of the two feet is immediately followed by the stressed 
syllable of another foot, and where that syllable must be more 
heavily stressed than the preceding one, a kind of ‘stress clash’ 
results, in which, rather than a eurhythmic sequence of S and 
W feet, an S-S sequence of feet occurs. In situations such as 
this, a rule of rhythm reversal applies. Consider some such 
phrases, e.g. academic banter, champagne breakfast, 
Tennessee Williams. Note that, in each case, the rule for 
phrasal stress assignment means that the second of the two 
words must have greater stress than the first. Note too that the 
primary and secondary stresses in the words academic, 
champagne and Tennessee have reversed. That is, the 
offending structure (exemplified in (20) below) is altered to the 
more eurhythmic structure exemplified in (21). 

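(20)

Feet:         W          S          S
Syllables:    S    W     S    W     S    W
              æ    kə    dɛ   mɪk   bæn  tə

(academic banter, before rhythm reversal)

(21)

Feet:         S          W          S
Syllables:    S    W     S    W     S    W
              æ    kə    dɛ   mɪk   bæn  tə

(academic banter, after rhythm reversal)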

This process of rhythm reversal is quite regular in English. 
Other examples are easily found; consider Piccadilly vs 
Piccadilly Circus, Heathrow vs Heathrow Airport, Dundee vs 
Dundee marmalade, thirty four vs thirty four books, good-looking 
vs good-looking tutor, and so on. As we have seen, in 
English phrases, it is the head, rather than a preceding modifier, 
which bears the most stress. Rhythm reversal occurs whenever 
a word containing a weak-strong sequence of feet is combined, 
to form a phrase or compound, with a word whose first 
syllable is the first syllable of a foot (i.e. is stressed). That is, 
rhythm reversal operates, within the context of phrases and 
compounds, on feet, not syllables, reversing weak-strong 
sequences of feet, rather than weak-strong sequences of 
syllables. Another way of putting this is to say that the reversal 
process reverses a sequence of a secondary stressed syllable 
and a primary stressed syllable when it is followed by a 
primary stressed syllable within a phrase. 

Reversal does not operate on a sequence consisting of an 
unstressed syllable and the first syllable of a foot, as in maroon 
sweater. The word maroon contains a single foot, which 
consists only of a stressed syllable with no following unstressed 
syllables; that foot is preceded by a ‘stray’ unstressed syllable 
(just like the unstressed syllable in America, shown above), 
thus: 

(22)

Syllables:    W    S       S    W
              mə   ruːn    swɛ  tə

(maroon sweater)

While maroon sweater contains an S-S sequence of 
syllables, it does not contain an S-S sequence of feet, and it 
does not therefore undergo rhythm reversal. The pervasive 
effects of the Rhythm Reversal Rule are said to constitute 
evidence for the existence of the foot as a constituent in the 
phonology of English. Furthermore, the fact that words like 
maroon do not undergo reversal can be taken as evidence for 
our claims (a) that English feet always begin with a stressed 
syllable, and (b) that English words are not necessarily 
exhaustively divisible into feet. In other words, a word like 
maroon is not to be analysed as consisting of a foot with a W-S 
sequence of syllables. 
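
The conditions on rhythm reversal can be made concrete with a short sketch in Python. It is only an illustration, and it rests on assumptions of our own: a word is represented simply as a list of foot labels ('W' or 'S'), stray unstressed syllables are left out because they belong to no foot, and the function name rhythm_reversal is hypothetical rather than standard.

```python
from typing import List


def rhythm_reversal(word_feet: List[str], next_begins_with_foot: bool) -> List[str]:
    """Return the foot labels of a word after rhythm reversal, if it applies.

    word_feet: relative strength of each foot in the word, e.g. ["W", "S"].
    next_begins_with_foot: True if the following word starts with a
    stressed syllable, i.e. with the first syllable of a foot.
    """
    ends_weak_strong = word_feet[-2:] == ["W", "S"]
    if ends_weak_strong and next_begins_with_foot:
        # Swap the final W-S pair of feet to avoid an S-S clash of feet.
        return word_feet[:-2] + ["S", "W"]
    # No W-S sequence of feet, or no following foot: nothing to reverse.
    return list(word_feet)


print(rhythm_reversal(["W", "S"], True))   # academic followed by banter
print(rhythm_reversal(["S"], True))        # maroon followed by sweater
print(rhythm_reversal(["W", "S"], False))  # academic uttered on its own
```

On this representation, academic (W-S) followed by banter comes out as S-W, whereas maroon, which contains only a single foot, is left untouched, since the rule inspects feet and never sees the stray syllable.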

The claim that rhythm reversal operates at the level of 
sequences of feet, rather than at the level of sequences of 
syllables, is supported by the fact that it operates in phrases 
such as good-looking tutor, which, prior to reversal, has the 
following structure: 

(23)

Feet:         W       S          S
Syllables:    S       S    W     S     W
              ɡʊd     lʊ   kɪŋ   tjuː  tə

(good-looking tutor)

If reversal operated at the level of sequences of S and W 
syllables (rather than feet), it would not affect the sequence 
looking and tutor, which have an alternating S-W-S-W 
structure. It is at the level of the foot that the S-W sequencing 
is violated. 

The most striking aspect of reversal is that it demonstrates 
the interaction of syntax and phonology. The conditions under 
which reversal operates are partly determined by a syntactic 
fact about English: the fact that modifiers typically precede 
heads in English phrases. This, combined with the fact that it is 
the head which receives more stress than the modifier, brings 
about the reversal phenomenon. 

Reversal also interacts with morphological structure: it 
operates within words which contain a suffix which itself takes 
stress. Take the word New York. It consists of two feet, the 
second stronger than the first: 

(24)

Feet:         W       S
Syllables:    S       S
              njuː    jɔːk

(New York)

The suffix -ese is one of those English suffixes which takes 
stress. It consists of a single non-branching foot, and when it is 
added to New York, the resulting word New Yorkese consists of 
three feet in a W-S-S sequence: 

(25)

Feet:         W       S       S
Syllables:    S       S       S
              njuː    jɔːk    iːz

(New Yorkese)

This structure meets the conditions for reversal, which then 
applies to yield the S-W-S sequence of feet in New Yorkese. 

It is true, of course, that one can, in fact, utter phrases such 
as academic banter with primary stress on the penultimate 
syllable of academic. When one does this, one is usually 
contrasting some aspect of the phrase with some other 
possibility. One might be stressing, for instance, that one means 
academic banter and not some other kind of banter, such as 
adolescent banter. The most important fact about this 
phenomenon, often referred to as contrastive sentence stress or 
contrastive intonation (see chapter 10 for details), is that it is a 
discourse phenomenon: one could not make sense of such a 
stressing if it were not accompanied by an appropriate 
discourse setting in which one’s interlocutor stands a chance of 
understanding what other possibilities one is contrasting the 
rhythm-reversed adjective academic with. The following 
exchange is an example: 

(26)

A: I do enjoy academic banter, you know. (rhythm reversal)

B: What kind of banter?

A: ˌAca'demic banter. (contrastive stress; no rhythm reversal)

The stress pattern here is nonetheless distinct from that given 
in (20) above, since the stressed syllable of banter is less 
stressed than the primary stressed syllable of academic.⁷ What 
examples such as these suggest is that reversal is a metrical 
phenomenon which interacts with morphology and syntax, and 
can be described independently of discourse context, whereas 
contrastive intonation is a phenomenon which cannot be 
described independently of discourse context. This suggests 
that it is possible, and perhaps necessary, to distinguish those 
phenomena which can be analysed independently of context of 
utterance from those which cannot. 


Notes 

1 There are also imperfect rhymes, where the phonetic 
segments following the stressed syllable are similar but not 
identical. The imperfect rhyme between the words manner 
(['mænə]) and banger (['bæŋə]) is an example of this: the 
place of articulation of the nasals is not the same, so banner 
would rhyme better with manner. But it’s not that bad a 
rhyme, after all: both [n] and [ŋ] are nasals, so they sound 
very similar. 

2 There are other factors at play here, including the 
availability of the morpheme flexi (as in flexitime, itself 
formed by analogy with overtime; flexi itself comes from the 
adjective flexible, in which the prefix is flex). The term 
pescitarian depends on the availability of knowledge of the 
Latin word for ‘fish’. 

3 In this case, the morphology and the metrical structure 
coincide. 

4 The metrical trees we present here are abbreviated, for 
reasons of lack of space. Because metrical structure is 
determined by syllable structure, we ought, strictly speaking, 
to show metrical trees built upon syllable structure trees, as 
follows: 
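
[Tree diagram: a metrical S-W structure built over syllable structure trees, in which each syllable is divided into an onset (O) and a rhyme (R), the rhyme contains a nucleus (N), and the lowest nodes are skeletal slots (x).]
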


The abbreviated trees used in this chapter do not actually 
show that it is the structure of the rhyme, and not the entire 
syllable, which is crucial in determining metrical structure. 
However, they will suffice for our purposes. 

5 We label the non-branching foot with an ‘S’, consistent 
with our treatment of monosyllabic words of a lexical 
category. 

6 The possibility of non-alignment of foot structure and word 
structure in English is often exploited by songwriters, who 
structure their songs on the basis of syntax (words and 
phrases), rhythm (metrical structure) and various types of 
repetition, such as alliteration, rhyming and repetition of 
metrical and phonemic structures. It is for this reason that it 
is possible to construct a repetition involving manners 
(['mænəz]), grammar (['ɡɹæmə]), and the word damn 
followed by the first, stray, unstressed syllable of about 
(yielding a foot whose form is ['dæmə]). Here, a repetition 
exploits syllable and foot structure while cutting across word 
and phrase structure. The sort of repetition known as 
‘rhyming’ can therefore extend beyond the syllabic 
constituent we have called the rhyme, into foot structure. 

The rhyming of continental and rental in the song ‘Diamonds 
are a girl’s best friend’ is a further example. 

7 It is often suggested that this follows from the fact that 
‘banter’ is somehow ‘given’ in the discourse context. 


Exercises 

1 Draw metrical trees for each of the following words. 
Begin by drawing the foot structure. Where a word 
contains more than one foot, draw a superordinate S-W or 
W-S branching structure, showing which of the two feet is 
stronger. 

(a) pretty 

(b) collided 

(c) sentiment 

(d) bat 

(e) nightingale 

(f) kangaroo 

(g) rabbi 

(h) contract 

2 Listen to Track 9.1 at www.wiley.com/go/carrphonetics 
and draw metrical trees for the following phrases, as heard 
on the recording: 

(a) sacked a worker 

(b) delighted agents 

(c) very pretty 

(d) Piccadilly Circus (Show the tree for this last 
phrase both before and after rhythm reversal has 
applied) 

3 Listen to Track 9.2 and explain the metrical structure of 
the following expressions, as heard on the recording: 

(a) an ill-advised decision 

(b) a well-decorated bedroom 

(c) West Hampton Wanderers 

(d) a broken-hearted man 

(e) fifteen dollars 

(f) fifty pounds 

(g) dark-green trousers 

(h) champagne cocktail 



4 Further phonetic transcription practice 
Transcribe, with as much phonetic detail as possible, the 
following words as they are uttered on Track 9.3, 
indicating syllable boundaries, primary stress and (where 
applicable) secondary stress: 

(a) rudimentary 

(b) unfriendliness 

(c) deconstructible 

(d) opportunity 

(e) Chinese 


English Intonation 


10.1 Tonic Syllables, Tones and Intonation Phrases 

Although we often say that some speakers speak in a 
monotonous manner, the fact is that human beings do not utter 
speech which is monotone in nature: we all inflect our speech, 
creating intonational contours. But what is intonation, exactly? 
It is the use of pitch variation in discourse. What is pitch? We 
have seen that pitch is the auditory impression created by 
variations in the rate of vibration of the vocal folds. Intonation 
is the use of pitch contours over stretches of speech which 
often consist of more than one word. An example is the 
utterance Mary went to the doctor. There are three syllables 
with primary word stress in this utterance: the penultimate 
syllable of Mary, the single syllable of went and the 
penultimate syllable of doctor. But there is additional pitch 
movement on the primary-stressed syllable of doctor. That 
stressed syllable is perceptually more prominent than the 
others, and will tend to be longer in duration, and louder, than 
the other stressed syllables in the utterance. That syllable is 
said to be the tonic syllable. The word ‘tonic’ denotes the fact 
that this syllable is where the tone falls. The tone is the extra 
pitch movement placed on that syllable. In our example, the 
tone is a falling tone: the rate of vibration of the vocal folds 
decreases as the syllable is uttered, resulting in a transition 
from a higher to a lower pitch. We will represent these as 
follows: 



(1) 'Mary 'went to the \doctor. (Track 10.1 at 
www.wiley.com/go/carrphonetics) 

As we saw in chapter 8 on word stress, the diacritics on 
'Mary and 'went indicate that the following syllables are 
stressed. The underlining on \doctor indicates that it is the 
tonic syllable (and thus, by definition, stressed), and the 
preceding ‘\’ diacritic indicates a falling tone. This kind of tone 
is typical of declarative utterances, in which the speaker is 
making a statement, as opposed to, say, posing a yes/no 
question. Other tones are possible in English. In yes/no 
questions (questions which may solicit the responses ‘Yes’ or 
‘No’), it is common to find a rising tone in the tonic syllable, 
as in the question Is Mary pregnant? We will represent rising 
tones as follows: 

(2) Is Mary /pregnant? (Track 10.2) 

Here, the penultimate syllables of Mary and pregnant are 
stressed, the tonic falls on the stressed syllable of pregnant, 
and the tone is a rise. 

A third tone is the rise-fall tone, in which the pitch rises and 
then falls, as in the following exchange: 

(3) 

Wife: Have you been 'seeing /Mary? 

Husband: /\No! (Track 10.3) 

The use of rise-fall tones conveys certainty, exclamation, 
strong conviction or strength of feeling on the part of the 
speaker. In this case, the husband is saying that he has 
certainly not been seeing Mary: the intonation conveys a 
complete denial of the implied accusation. 

A fourth tone is the fall-rise tone, as in the following 
exchange: 

(4) 

Wife: Have you been 'seeing /Mary? 

Husband: \/No! (Track 10.4) 

Here, the pitch falls then rises in the second utterance. Use 
of such a tone conveys hesitation, lack of certainty, 
prevarication or reservation on the part of the speaker. In this 
example, the husband is being less than clear and 
straightforward in his response: he is denying that he’s been 
seeing Mary, or trying to suggest that what he’s been doing 
does not really amount to ‘seeing Mary’ in the romantic sense. 

A stretch of discourse which contains a tonic syllable is 
called an intonation phrase (IP), otherwise known as an 
intonation group, intonation unit or tone group. These are also 
referred to as breath groups, since they constitute units of 
speech in which we expel air from the lungs. When we stop 
speaking to draw breath, we often do so at the end of a tone 
group: it is common for speakers to pause at the end of such 
units. 

It is common to identify three main features of intonation: 

• the chunking up of stretches of speech into IPs, 
• the placing of the tonic on one of the stressed syllables of that chunk and 
• the assignment of a specific tone on the tonic syllable. 

These are, to some extent, independent variables: what 
syllable we choose to put the tonic on can be independent of 
where the IP boundaries go, and what tone we place on the 
tonic syllable can be independent of where we choose to place 
the tonic. 

In examples (1) to (4), the tonic falls on what is known as 
the last lexical item (LLI). Recall from our discussion of word 
stress that words can be classified into two broad groupings: 
words of a lexical category (typically nouns, verbs, adjectives 
and adverbs) and words of a functional, or grammatical, 
category (such as articles, conjunctions, prepositions and 
pronouns). The last lexical item in a syntactic unit is thus the 
last noun, verb, adjective or adverb. For example, in (1), the 
LLI is the noun doctor, and in (2) it is the adjective pregnant. 
The following examples contain, in (5), an LLI which is a verb, 
and in (6), an LLI which is an adverb: 



(5) My 'husband \cheats. (Track 10.5) 

(6) His 'lover 'walks \gracefully. (Track 10.6) 

The LLI may not be the last item in an IP, as in: 

(7) 'Bill \gave it to her. (Track 10.7) 

Here, the last item is a pronoun, which is not a lexical item, 
and thus does not take the tonic. The second-last item is a 
preposition, and thus also fails to take the tonic. The third-last 
item is a pronoun, so that too fails to take the tonic. The tonic 
falls on gave, since it is the LLI, but not the last item. In cases 
like this, any syllables which follow the tonic syllable are said 
to constitute the tail of the IP: after the fall here, the pitch just 
trails off at a low level into the remaining syllables after the 
tonic. 
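
The default placement of the tonic can be sketched as a simple right-to-left search. The following Python fragment is a minimal illustration under assumptions of our own: each word of an IP is supplied by hand as a (word, word class) pair, the word-class labels and function names are hypothetical, and the tail is approximated as the words after the tonic word rather than the syllables after the tonic syllable.

```python
from typing import List, Optional, Tuple

# Word classes treated as lexical categories, following the text.
LEXICAL_CATEGORIES = {"noun", "verb", "adjective", "adverb"}

Token = Tuple[str, str]  # (word, word class)


def last_lexical_item(ip: List[Token]) -> Optional[int]:
    """Return the index of the last lexical item in an IP, or None."""
    for index in range(len(ip) - 1, -1, -1):
        if ip[index][1] in LEXICAL_CATEGORIES:
            return index
    return None


def tail(ip: List[Token]) -> List[str]:
    """Return the words that follow the word carrying the tonic."""
    index = last_lexical_item(ip)
    if index is None:
        return []
    return [word for word, _ in ip[index + 1:]]


# Example (7): Bill gave it to her.
example_7 = [("Bill", "noun"), ("gave", "verb"), ("it", "pronoun"),
             ("to", "preposition"), ("her", "pronoun")]

tonic_word = example_7[last_lexical_item(example_7)][0]
print(tonic_word, tail(example_7))
```

For example (7), the search skips her, to and it, settles on gave, and leaves it to her as the tail. The departures from the default discussed in the next section are deliberately not modelled here.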


10.2 Departures from the LLI Rule 

The LLI rule is the default rule for the placement of the tonic. 
By ‘default’, we mean the point where the tonic is placed if no 
special circumstances prevail. Defaults in linguistics are rather 
like the default settings on your computer: they are the settings 
that are used unless one deliberately changes the set-up for 
some special purpose. It is common in English to shift the tonic 
away from the default position, for various purposes. We now 
examine some of those. 

10.2.1 Contrastive Intonation 

(8) 

Speaker A: 'Mary 'gave 'John a \camera. 

Speaker B: No, she didn’t give it to him; he gave it to her. (Track 10.8) 
(= \No | \she 'didn’t 'give it to \him | \he 'gave it to \her) 

Here, the italics show that, in addition to No, four pronouns 
receive the tonics (the vertical lines indicate the boundary 
between the IPs): it wasn’t Mary that gave John a camera: it 
was John that gave Mary a camera. The referent of the word 
camera here is given, once speaker A has spoken. Consider the 
following possible intonational patterns for the sentence John is 
taking the train to London: 

(9) 'John is 'taking the 'train to \London. (Track 10.9) 

(10) 'John is 'taking the \train to 'London. (Track 10.10) 

(11) \John is 'taking the 'train to 'London. (Track 10.11) 

In (9), we have the default pattern for tonic placement, with 
the tonic on the stressed syllable of the LLI. In (10), we have 
contrastive intonation: the train is being contrasted with some 
other mode of transport, such as the plane. In (11), the speaker 
is stressing the fact that it is John, not someone else, who is 
taking the train to London. 
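
The transcription conventions used in these examples are regular enough to be read mechanically. The sketch below is a hedged illustration only. It assumes a plain-text rendering of the tone marks that is our own choice (\ for a fall, / for a rise, /\ for a rise-fall and \/ for a fall-rise, each written immediately before the tonic syllable), it assumes one tonic per IP, and the function name parse_ips is hypothetical.

```python
import re
from typing import List, Optional, Tuple

# Assumed plain-text tone marks and the tones they stand for.
TONE_NAMES = {"/\\": "rise-fall", "\\/": "fall-rise", "\\": "fall", "/": "rise"}

# Two-character marks are listed first so that they win over the single ones.
TONE_PATTERN = re.compile(r"/\\|\\/|\\|/")


def parse_ips(utterance: str) -> List[Tuple[Optional[str], Optional[str]]]:
    """Split an utterance at '|' and report (tonic word, tone) for each IP."""
    results = []
    for chunk in utterance.split("|"):
        tonic_word, tone = None, None
        for word in chunk.split():
            match = TONE_PATTERN.search(word)
            if match:
                tone = TONE_NAMES[match.group(0)]
                # Strip tone marks, stress marks and punctuation from the word.
                tonic_word = re.sub(r"[/\\']", "", word).strip(".,!?;:")
                break
        results.append((tonic_word, tone))
    return results


print(parse_ips(r"'John is 'taking the 'train to \London."))
print(parse_ips(r"'John is 'taking the \train to 'London."))
print(parse_ips(r"\John is 'taking the 'train to 'London."))
```

Run on the three versions of the sentence, the function reports a fall on London, train and John respectively, which makes explicit that the words and the tone are held constant while only the placement of the tonic varies.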

This use of tonic placement relates to what is called focus. 
In (9), we have broad focus, associated with statements in 
which everything is news. These are statements which are said 
to come ‘out of the blue’: all of the information is announced 
as new information, so everything in the utterance is brought 
into focus. 

In (10), the person being addressed already knows that John 
is going to London: what is news is the information concerning 
his mode of transport. This is called narrow focus. 

In (11), the addressee knows that someone is taking the train 
to London, and is being informed that the person in question is 
John. This too is a case of narrow focus. Narrow focus relates 
to the given/new distinction. Given information is shared 
(mutual) knowledge, known to both the speaker and the 
hearer. New information is information that is not already 
shared by the speaker and the hearer. 

The tonic can be moved onto almost any syllable for 
contrastive purposes, including affixes. Here is a statement by 
the British prime minister David Cameron in 2010: 

(12) It’s not \unem'ployment that will be cre'ated. | It’s 
\employment. 

Normally, employment and unemployment have primary 
stress on the penultimate syllable, as in We’re 'trying to cre'ate 
em\ployment. Here, the prime minister is contrasting 
employment with unemployment. We see from this example 
that, given a context in which we wish to highlight a given 
word in order to contrast it with another, the tonic may be 
placed on something other than the LLI (unemployment in the 
first IP), and that even affixes may receive the tonic (the un- 
prefix in unemployment). 

10.2.2 Given Information 

Another situation in which the LLI rule is flouted concerns the 
notions of given and new information. Consider the following 
exchange: 

(13) 

A: We need tomatoes. (We 'need to\matoes) 

B: We’ve got tomatoes! (We’ve \got to'matoes) (Track 10.12) 

The word tomatoes is given in the first statement: it has 
already been mentioned, so the information it conveys is given 
(shared by the participants in the exchange). The tonic is 
therefore shifted away from the LLI (tomatoes) onto the word 
got. Now consider the following: 

(14) In \most 'cases, | we a'pply the \rule, | but in 
\some cases, | we \don’t. (Track 10.13) 

Here, the LLI in the first IP (cases) is given by the context 
of utterance: if we utter (14), the person we are speaking to 
already knows what the rule is about, and what kinds of cases 
are being spoken about. 
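
The shift seen in (13) can be sketched by letting the search for the last lexical item skip over words that convey given information. This is a minimal illustration under assumptions of our own: the given words are simply passed in as a set, the word-class labels are assigned by hand, and the function name tonic_index is hypothetical. It covers cases like (13) only. In (14) the tonic lands on most and some, which are not lexical items, so a fuller account would also need some notion of contrast.

```python
from typing import FrozenSet, List, Optional, Tuple

LEXICAL_CATEGORIES = {"noun", "verb", "adjective", "adverb"}

Token = Tuple[str, str]  # (word, word class)


def tonic_index(ip: List[Token],
                given: FrozenSet[str] = frozenset()) -> Optional[int]:
    """Return the index of the last lexical item that is not given."""
    for index in range(len(ip) - 1, -1, -1):
        word, category = ip[index]
        if category in LEXICAL_CATEGORIES and word.lower() not in given:
            return index
    return None


# Speaker B's reply in (13), with 'have' standing in for the contracted 've.
reply = [("We", "pronoun"), ("have", "auxiliary"),
         ("got", "verb"), ("tomatoes", "noun")]

print(reply[tonic_index(reply)][0])
print(reply[tonic_index(reply, frozenset({"tomatoes"}))][0])
```

With nothing given, the search returns tomatoes, as the LLI rule predicts. Once tomatoes is marked as given, it returns got, which is where the tonic falls in speaker B's reply.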

Synonyms can count as conveying given information: 

(15) 

A: She’s 'borrowed 'Jane’s \frock. 

B: \No, | it’s \Mary’s 'dress. (Track 10.14) 

Here, the word dress isn’t given, but its meaning is, via the 
uttering of the synonym frock. 

Presuppositions can be conveyed via tonic placement: 

(16) 

A: Have you 'spoken to /John? 

B: I don’t /speak to 'racists. (Track 10.15) 

B is presupposing that ‘John is a racist’ is given information. 
So the tonic shifts from the LLI (racists) to the preceding lexical 
item. Speaker B can impose the presupposition that John is a 
racist, even if that is open to question. 

Notice here that the verb speak is given, but nonetheless 
takes the tonic: contrastive intonation can lead to the tonic 
being placed on given information. Here is another example: 

(17) 

A: He’s 'going to /Paris. 

B: He’s not 'going to /Paris. | He’s 'going to /London. (Track 10.16) 

Given information can be shared by millions of people (e.g. 
the fact that Barack Obama was elected president of the 
United States) or by as few as two people (e.g. husband and 
wife). 

10.2.3 Final Temporal Adverbials 

It is common to find that LLIs which are in syntactic units 
which have an adverbial function, and which convey 
information relating to time, fail to take the tonic, as in: 

(18) 'John’s 'going to \London on 'Saturday. (Track 10.17) 

Here, Saturday is the LLI, but since the prepositional phrase 
on Saturday is a final temporal adverbial, the LLI within that 
adverbial expression fails to take the tonic. If we were to place 
the tonic on Saturday, that would constitute a case of 
contrastive intonation: 

(19) 'John’s 'going to 'London on \Saturday (Track 10.18) 
(as opposed to some other day of the week). 

When final adverbial expressions are fronted, they tend to 
form a separate IP: 

(20) On \Saturday, | 'John’s 'going to \London. (Track 10.19) 

10.2.4 ‘Event’ Sentences 

These are rather curious. They are short statements which 
contain intransitive verbs, but the tonic falls on the subject 
noun rather than on the LLI: 

(21) 

(a) The \kettle’s 'boiling. 

(b) The \baby’s 'crying. 

(c) Your \house is on 'fire. 

(d) The \sun’s 'come 'out. (Track 10.20) 

One would have expected the tonics to fall on the LLIs 
boiling, crying and fire, and on the particle out (see above on 
LLIs, and below on intransitive phrasal verbs, where we expect 
the tonic to fall on the particle). 

It has been observed that the subjects in such sentences do 
not denote human agents, but why that should affect the tonic 
placement is far from clear. There seems to be pragmatic 
foregrounding (selecting out) of the subject in such cases. 


10.2.5 Non-Lexical Items which Often Take the Tonic 

The negative equivalents of the non-lexical items someone, 
something, somewhere, somebody (no one, nothing, nowhere, 
nobody) often take the tonic: 

(22) 

(a) I 'saw \no one. 

(b) I’ve 'done \nothing. 

(c) We’re 'getting \nowhere. 

(d) This 'interests \nobody. (Track 10.21) 

One can place the tonic on the non-lexical items someone, 
something, somewhere and somebody, but only if the 
intonation is contrastive, as in: 

(23) 

A: I 'saw the 'neighbour in the \pine grove this 'morning. 

B: You \couldn’t have. | He’s in \Paris 'right 'now. 

A: Well I 'saw \someone. (Track 10.22) 

(That is, not the neighbour, but some other person). 

Non-native speakers should be aware that pro-forms such as 
one and do so are not lexical items: they convey given 
information, and thus do not normally take the tonic, as in: 

(24) 

(a) A: I 'went 'looking for a 'bottle of \wine. 

B: Did you /get one? 

A: \Yes. 

(b) 'Mary 'drank some \wine | and \Bill did so too. (Track 10.23) 

10.2.6 Cleft Sentences 

Cleft sentences take the following form: 

(25) It’s Scotsmen that wear kilts. 
It was Bill that did it. 

Clefting is a way of highlighting, or bringing into focus, a 
syntactic constituent. One could say: 

(26) I 'love \John. 

But in the cleft version, the contrast between John and 
anyone else is more emphasized: 

(27) It’s \John that I 'love. (Track 10.24) 

Although love is the LLI, the tonic falls on the highlighted 
item. Here, the material after the highlighted material forms the 
tail of the IP: the hearer knows that the speaker loves 
someone, so that knowledge is given, and no further tonic is 
required. 

10.2.7 Deictic Expressions 

The word deictic means ‘involved in pointing out’, either by 
literally pointing at something with one’s finger while speaking, 
or bringing something to someone’s attention without 
physically pointing. Deictic expressions in English include the 
demonstrative words this, that, these and those. These count 
as function words, so they do not take the tonic when the LLI 
rule applies, as in the following questions: 

(28) 

(a) Could you /give me that? 

(b) Can I have /five of those? (Track 10.25) 

In such utterances, it is clear from the context of utterance 
what that and those are being used to refer to (in the first 
example, a parent might be asking a child to hand her a knife; 
in the second example, a shopper might be asking a shopkeeper 
to give her five oranges). 

If the thing being referred to is explicitly mentioned, the LLI 
rule will assign the tonic to the noun in question, as in: 

(29) 

(a) Could you 'give me that /knife? 

(b) Can I have 'five of those /oranges? (Track 10.26) 

The tonic can fall on deictic expressions when they are being 
used contrastively, as in: 

(30) 

(a) Could you 'give me /that 'knife? (as opposed to some other knife) 

(b) Can I have 'five of /those 'oranges? (Track 10.27) (as opposed to some other varieties of orange) 

10.3 IPs and Syntactic Units 

10.3.1 Syntactic Units which Normally Form a Separate IP 

There are syntactic units which normally form separate IPs, 
such as relatively short main clauses, as in (1) to (3) in section 
10.1: here, we can see that the intonation tracks the syntax. 
This is unsurprising, since both clauses and IPs convey 
coherent chunks of information. We will now examine a range 
of other syntactic units which normally form independent IPs. 

10.3.1.1 Parentheticals 

Parenthetical information is extra, optional information offered 
by the speaker. If parentheticals are omitted from a syntactic 
structure, the structure in question remains grammatically well- 
formed. Let us look at some types of parenthetical. 

Non-restrictive relative clauses 



(31) The 'guys in the \car, | who were \hungry, | 'ate 
some \sandwiches. (Track 10.28) 

Here, the IP boundaries correspond to the commas in the 
written form of the sentence. The tonics fall on the LLIs in 
each IP: car, hungry and sandwiches. The meaning conveyed 
is that all of the guys in the car were hungry (thus the 
expression ‘non-restrictive relative clause’: the range of 
referents is not restricted). 

(Restrictive relative clauses 

Note that these do not normally count as parentheticals, and 
thus do not normally form a separate IP, as in: 

(32) The 'guys in the 'car who were \hungry | 'ate some 
\sandwiches. (Track 10.28) 

In the second example, there are only two IPs, as opposed to 
three in the first. The subject noun phrase is sufficiently long to 
form a separate IP, but the restrictive relative clause within that 
noun phrase does not form a separate IP. The difference in 
meaning between (31) and (32) is that, in (32) it is not 
necessarily the case that all of the guys in the car were hungry: 
the meaning is restricted only to the hungry guys in the car.) 

Noun phrases in apposition 

Noun phrases are said to be in apposition when they are co- 
referential, that is, when they are being used to refer to the 
same person or entity, as in: 



(33) 'Barack O\bama, | a 'Democrat politician, | is 
in\telligent. (Track 10.29) 

Other parentheticals 

(34) \/Mary, | you’re not 'going to be\/lieve this, | but 
'Jane is /\pregnant! (Track 10.30) 

Note that parentheticals are uttered on a lower pitch range 
than the preceding and following IPs: if you listen carefully to 
Tracks 10.28, 10.29 and 10.30, you should be able to hear 
this. 


10.3.1.2 Co-Ordinated Constituents 

(35) 

(a) 'Mary 'moved to \Paris | but 'John 'stayed in \London. 
(Sentence co-ordination) 

(b) 'John 'went to the \pub | and 'ordered a \beer. (Verb phrase 
co-ordination) 

(c) She’s 'very \tall | and 'very \pretty. (Adjective phrase 
co-ordination) 

(d) His 'very \well | and 'very \quickly. (Adverb phrase 
co-ordination) 

(e) It’s 'either 'in the \fridge | or 'on the \table. (Prepositional 
phrase co-ordination) 

(f) He 'bought the 'house on the \hill | and the 'woods in the 
\valley. (Noun phrase co-ordination) (Track 10.31) 

However, when the constituents are short, separate IPs are 
not always required: 

(36) 

(a) She’s 'tall and \lanky. (Adjective phrase co-ordination) 

(b) He 'stopped and \stared. (Verb phrase co-ordination) 

(c) He 'bought 'milk and \cheese. (Noun phrase co-ordination) 
(Track 10.32) 

This is especially noticeable when co-ordinated elements 
have been lexicalized (formed into lexical items) and are taken 
to refer to single entities, such as: 

(37) 

(a) 

'fish’n’\chips

(b)

'beer and \skittles

(c)

'strawberries and \cream (Track 10.33)

Other co-ordinated items that are used to refer to what is 
perceived as a single entity or unit are British pub names: 

(38) 

(a) 

The 'Dog and \Duck

(b)

The 'Fox and \Hounds

and names of couples:

(c)

'Bill and \Mary, 'Jane and \Clive (Track 10.34)

(The couples here are considered to be ‘an item’.) 

10.3.1.3 Items on Lists 

Normally, each item on the list constitutes a separate tone
group:

(39) He bought /eggs, | /milk, | to/matoes | and /ham.
(Track 10.35)

Non-final items on a list often take a rising tone: this signals 
that the list is not yet complete. 

10.3.1.4 Subordinate Clauses 

When a sentence contains a subordinate clause, the clause 
boundary often corresponds to an IP boundary: 

(40) 

(a) 

I’ll 'buy the 'fish’n’\chips | when I 'go to the \shops.

(b)

I 'told the 'new re'cruit to the /company | that he was /fired.
(Track 10.36)

If the material preceding the subordinate clause is relatively 
short, the subordinate clause need not form a separate IP: 



(41) I 'think she’s been /sacked. (Track 10.37)

10.3.1.5 Sentence Adverbials 

Adverb phrases necessarily have an adverbial function. But 
other phrases, notably prepositional phrases, can have an 
adverbial function. 

It is common to distinguish verb phrase adverbials from 
sentence adverbials, as in (42a) and (b) respectively: 

(42) 

(a)

John went to the interview hopefully.


(b) 

John went to the interview, hopefully. 

In (42a), John was hopeful. In (42b), the speaker is hopeful. 
Notice that (42b) can be rephrased with the sentence 
adverbial in initial position: 

(c) 

Hopefully, John went to the interview. 

In either case, sentence adverbials form a separate IP, and 
they have the fall-rise tone: 

(43) 

(a) 

\/Hopefully, | 'John 'went to the \interview.

(b)

'John 'went to the \interview, | \/hopefully. (Track 10.38)

10.3.1.6 Pseudo-Clefts 

These take the syntactic form What he needs is a bath. It is 
common for these to form two separate IPs: 

(44) What he \/needs | is a \bath. (Track 10.39)

10.3.1.7 The is ... is that Construction

(45) 

(a) 

The 'thing \/is | is that she’s \pregnant.

(b)

The alternative \/is | is that we’ll have to be'come more
in'volved in \Europe.

(c)

My con'cern \/is | is that it’s 'got 'too \big | too \quickly.

(d)

The 'fact of the 'matter \/is | is that the 'way it is \run | is
too \complex.

(e)

The 'good 'news \/is | is that they’re 'lending to 'small
\businesses. (Track 10.40)

This construction is very widespread in spoken English, both 
informal and formal (most of these examples are taken from 
formal interviews with British politicians on BBC TV). 
Although it is possible not to have an IP boundary after the
first is, having one is the norm.

10.3.2 Syntactic Units which Do Not 
Normally Form Separate IPs 

10.3.2.1 Reporting Clauses 

These abound in novels, but also in everyday speech:

(46) I’m \tired, he said. (Track 10.41)

Here, the falling tone on the tonic syllable keeps trailing off
into the reporting clause (he said).

It is possible, however, to form a separate IP around a 
reporting clause. Compare (47a) and (b): 

(47) 

(a) 

He’s retired, I think. 

(b) 

He’s retired, | I \think. (Track 10.42)

Utterance (47b) conveys less certainty than (47a). 


10.3.2.2 Subject Noun Phrases 

(48) The old man kicked the \dog. (Track 10.43)

But, as we saw earlier, the longer a subject noun phrase,
the more likely it is that a separate IP will be possible.

10.3.2.3 Restrictive Relative Clauses within 
Subject Noun Phrases 

As we have seen, these do not normally form a separate IP, as 
in example (32), repeated here (the long subject NP forms a 
separate IP, but not the relative clause): 

(32) The 'guys in the 'car who were \hungry | 'ate some
\sandwiches.

10.3.3 Syntactic Units which May,
or May Not, Form Separate IPs,
Depending on the Sense Conveyed

10.3.3.1 Tag Questions 

Reverse polarity tag questions 

By reverse polarity we mean that the first part is in the 
positive, and the tag question in the negative, or vice versa, as 
in: 


(49) 

(a) 

You’re going to do this, aren’t you? 

(b) 

You’re not going to do this, are you? (Track 10.44)

The intonation of reverse polarity tag questions works as
follows: if we form a separate IP on the tag question, and place 
a falling tone on it, the tag invites agreement: 

(50) 

(a) 

You’re 'going to \do this, | \aren’t you? 

(b) 

You’re not 'going to \do this, | \are you? (Track 10.44)

If there is a rising tone on the tag, it need not form a separate
IP: 

(51)

(a)

She’s 'coming to the /party, isn’t she?

(b)

She’s not 'getting /married, is she? (Track 10.45)

Here, the tag question forms part of the tail of the IP. But 
the tag question can form a separate IP: 


(52) 

(a) 

She’s 'coming to the /party, | /isn’t she?

(b)

She’s not 'getting /married, | /is she? (Track 10.46)

The differences in conveyed meaning are subtle: in (51a),
the speaker is not entirely certain whether she’s coming to the 
party. In (51b), the speaker may be expressing surprise, or 
even astonishment, whereas in (52a) and (52b), the speaker is 
a lot less sure, and is posing more of a query than in (51a) and 
(51b). 


10.4 Tonic Placement, IP 
Boundaries and Syntax 

10.4.1 Phrasal Verbs 

Phrasal verbs in English have two parts: the first part, which 
looks like a normal verb, and the second part, which looks like
a preposition, and is often called a particle. They can be
transitive (which means that they are followed by a direct 
object, as in He chatted up the waitress, where the waitress is 
the direct object) or intransitive (which means that no direct 
object is required, as in He backed down). Learners of English 
as a foreign language are well-advised to learn the intonation of 
such verbs, since there are so many of them, and they occur 
with high frequency in spoken English. 

10.4.1.1 Transitive Phrasal Verbs 

If the direct object is a full noun phrase rather than a pronoun,
the tonic falls on the head noun in that noun phrase (the head
noun is the noun in a noun phrase which is semantically the most
prominent):

(53) 

(a) 

He 'chatted 'up the \waitress.

(b)

He 'chatted the \waitress 'up.

If the direct object noun phrase is a pronoun, the tonic falls
on the particle (which is normally shifted so that it follows the
direct object):

(c) He chatted her \up. (Track 10.47)

10.4.1.2 Intransitive Phrasal Verbs 

These take the tonic on the particle: 

(54) He 'backed \down. (Track 10.48)

However, many short sentences with intransitive phrasal
verbs are ‘event sentences’ (see 10.2.4 above), in which case 
the tonic is retracted: 

(55) 

(a) 

The \plane 'blew up.

(b)

The \car 'broke down. (Track 10.49)

10.4.2 Degree Adverbials 

The most central example of a degree adverbial is the word 
very. It functions to modify adjectives in adjective phrases, and 
adverbs in adverb phrases, as in: 

(56) 

(a) 

He’s very \tall. (Adjective phrase)

(b)

He 'talks very \slowly. (Adverb phrase) (Track 10.50)

Other degree adverbials include so, incredibly, the mild
swearword bloody and the stronger ‘f-word’, as in: 

(57)

(a)

He’s so \stupid.

(b)

He’s in'credibly \arrogant.

(c)

He’s 'bloody \good.

(d)

He’s 'fucking \good. (Track 10.50)

But the tonic can be placed on the degree adverbial, for 
emphasis, as in: 

(58)

(a)

He’s /\so 'stupid!

(b)

He’s in/\credibly 'arrogant!

(c)

He’s /\bloody 'good!

(d)

He’s /\fucking 'good! (Track 10.51)

There is a use of so in colloquial English which acts as a 
verb phrase adverbial, and takes the tonic, with an exclamatory 
tone, as in: 

(59)

(a) A (to C): Has 'anyone ever \told you | that you’ve 'wasted your
/life?

B: You /\so 'haven’t!

(b) A: Why do 'men 'dress up as 'women at 'fancy \dress 'parties?
B: They /\so 'do!

(c) I’m /\so not 'shining 'shoes! (The speaker refuses to shine shoes)

(d) 'You two are /\so 'going 'out with the 'wrong 'men.

(e) That’s /\so not 'cool, 'Carol. (Track 10.52)


10.5 Tones and Syntax 

10.5.1 WH Questions 

These normally have a falling tone: 

(60) Where are you \going? (Track 10.53)

But not when used echoically: 

(61) 

A: I’m 'moving to \London.

B: /Where are you moving to? (Track 10.54)

Speaker B here has either not properly heard what A said, or 
is expressing incredulity. B’s WH question is said to be echoic 
in that it echoes part or all of what A has just said. Notice that 
the tone keeps on trailing upwards in the tail of the IP, just as it 
trails downwards when a tonic syllable before the tail has a 
falling tone.

10.5.2 Declaratives as Questions

We can use syntax to form a yes/no question, in which the first
auxiliary verb in the main clause is inverted around the subject
noun phrase, as in:

(62) (a) Have they 'found your /mobile?

The corresponding declarative statement would have a falling
tone:

(b) They’ve 'found your \mobile.

But we can retain the declarative syntactic structure and still
ask a question by placing a rising, rather than a falling, tone on
the LLI:

(c) They’ve found your /mobile? (Track 10.55)

Declarative structures can also be uttered with a fall-rise:

(63) 

A: They’ve 'found my //mobile! 

B: They’ve 'found your \/mobile? (Track 10.55)

In using a fall-rise, B is expressing surprise that the mobile 
has been found, or that it is the mobile, as opposed to some
other lost object, that has been found. 

10.6 Tonic Placement and 
Discourse Context

10.6.1 Vocatives

Vocative expressions are used for addressing one’s interlocutor,
as in the following extracts from telephone messages left by a 
speaker calling a friend called Nick: 

(64) 

(a) 

Nick, it’s me. 

(b) 

It’s me, Nick. 

Here the speaker assumes that the addressee (Nick) can 
identify the voice of the person calling, or that Nick is 
expecting a call from that person. 

Initial vocatives form a separate IP: 

(c) \Nick, | it’s \me.

Final vocatives do not:

(d) It’s \me, Nick. (Track 10.56)

If an IP is formed around the final Nick, as in (65):

(65) It’s \me, | \Nick. (Track 10.57)

then the word Nick is not interpreted as a vocative; rather, it
is interpreted as the name of the caller. 

10.6.2 Other Meaning Differences 
Conveyed by IP Boundaries 

As we’ve seen, the placement of IP boundaries, and/or the
kind of tone we select, can convey differences in meaning.
Consider: 

(66)

(a)

He didn’t marry her because she was French.

(= He 'didn’t 'marry her because she was \French)

This means that he did marry her, but not because she was
French.

(b)

He didn’t marry her, because she was French. (Track 10.58)

(= He 'didn’t \marry her | because she was \French)

This means that he didn’t marry her, the reason being that 
she was French. 


10.7 Summing Up 

We have seen that there are three main structural aspects of 
English intonation: the dividing up of utterances into 
intonational phrases which are chunks of information, the 
placing of a tonic on one of the stressed syllables in each 
chunk, and the kind of tone we use in that tonic syllable. 
Intonation in English, we have seen, is connected to syntactic 
structure, the lexical vs functional distinction, the meaning 
expressed by the syntactic units in question, and aspects of 
discourse linked to the context of utterance and phenomena 
such as conveyed meaning and the speaker’s attitude towards 
what she or he is saying. Perhaps the most striking aspect of
English intonation is the extent to which it is dynamic, in the
sense that speakers of English frequently move the tonic away 
from the default LLI position, for a wide variety of purposes. 
For non-native speakers of English, some degree of mastery of 
this will result in a much more native-like speech style. 


Exercises 

1 In each of the following utterances, identify the last 
lexical item. 

(a) John went to the pub. 

(b) Mary put her finger on it. 

(c) My father says he can’t understand that. 

(d) He talks rather slowly. 

(e) I want that pink one. 

2 Which of the following questions can have a rising tone? 

(a) Is Bush mad? 

(b) What do you want? 

(c) Have you eaten? 

(d) How does this work? 

(e) Isn’t it time for lunch? 


3 Listen to Track 10.59 . Where do the tonics fall in the 
following utterances in that sound file? Explain why. 

(a) She chatted up the waiter. 

(b) She chatted him up. 

(c) She broke down. 

(d) We’ve split up. 


(e) I’ve put him off. 

4 Listen to Track 10.60 . Identify the IP boundaries in the 
following utterances in that sound file and say where the 
tonics fall. Explain why. 

(a) Mary, you’re fired. 

(b) You’re fired, Mary. 

(c) He’s mad, she said. 

(d) It’s an evil empire, said the president of the 
United States. 

5 Listen to Track 10.61 . Identify the IP boundaries in the 
following utterances in that file and say where the tonics 
fall. Explain why. 

(a) Mary, a good friend of mine, is pregnant. 

(b) The guys in the car, who were hungry, ate some 
sandwiches. 

(c) The guys in the car who were hungry ate some 
sandwiches. 

(d) Bill, you won’t believe this, you’ve passed your 
exam. 

(e) His new book, Making Friends, is sure to be a 
bestseller. 

6 What is the default tonic placement in the following 
utterances? 

(a) He went to London on Thursday. 

(b) I haven’t seen her recently. 

(c) He left for Paris in a hurry. 

(d) She left her bedroom in a mess. 

(e) He speaks quickly. 

7 What are the intonation group possibilities in the
following utterances? Discuss different tone possibilities in 
the tag questions. Listen to Track 10.62 and describe the 
intonational structure you hear, including IP boundaries 
and tones. 

(a) You’re not pregnant, are you? 

(b) You like lasagne, don’t you? 

(c) You do play golf, don’t you? 

(d) We can sort this out, can’t we? 

(e) We’ll never sort this out, will we? 

8 Where do the tone group boundaries fall in the following 
utterances, and where do the tonics fall? If there are 
alternative intonational structures, say what you think they 
are. Listen to Track 10.63 and describe the intonational 
structure you hear, including IP boundaries and tones. 

(a) ‘You can’t go’, said Bill Smith, a good friend of 
mine. 

(b) ‘Obama can’t win in Texas’, claims Hillary 
Clinton, a woman whose husband Bill, ex-president, 
is from the South of the USA. 

(c) ‘Is Amy Winehouse in rehab?’, asked Jonathon 
Ross on the Thursday after she sang out of tune at a 
concert in London. 

(d) ‘Dickens I can’t stand’, confessed the young 
recruit to a university lectureship in Victorian 
literature. 

(e) ‘What does George Bush, a devout Christian, 
have to say about the treatment of prisoners in Abu 
Ghraib?’, asked the chair of the committee.

11


Graphophonemics: Spelling- 
Pronunciation Relations 

11.1 Introduction 

The relationship between spelling and pronunciation in English 
is complex: it can seem completely arbitrary. The complexities 
have historical sources: they result from changes introduced by 
scribes after the Norman Conquest, the adopting of many 
loanwords with foreign spellings, and, above all, changes in the 
phonological system of English as the language evolved from 
Old English, through Middle English, into Early Modern
English and present-day English. Despite the complexities, 
there are some basic regularities which are worth learning, 
especially for non-native speakers of English. 

We begin by distinguishing between letters and graphemes. 
There are twenty-six letters in the Roman alphabet, but there 
are more than twenty-six visual symbols (graphemes) for 
representing English phonemes and allophones, since 
combinations of letters can be used to represent a given 
phoneme or allophone. Examples are <ph>, which corresponds 
to the /f/ phoneme, as in the word photograph, <th>, which 
corresponds to both the /θ/ and /ð/ phonemes, as in the words
think and this, and <oa>, which corresponds to the RP and GA
phonemes /ou/ and the SSE (Standard Scottish English)
phoneme /o/, as in the word boat. We will refer to such 
graphemes as digraphs, since they contain two letters. We will 
refer to graphemes with three letters, such as <sch>, in words 
such as schmaltzy (‘overly sentimental’), as trigraphs. The 
distinction between letters and graphemes can be seen in the 
different writing conventions of English and French. For 
instance, in writing the initial for my first name in English, the
convention is to use the first letter: ‘P. Carr’. The French
convention is to select the first grapheme: ‘Ph. Carr’. The term
graphophonemics is the name given to the study of the 
relationship between graphemes and phonemes (and some of 
their allophones). We will begin by examining vowel graphemes 
in English, and then proceed to consonant graphemes. 

11.2 Vowel Graphemes and 
Their Phonemic Values 

11.2.1 Vowel Monographs 

Let us begin with the five vowel graphemes <a>, <e>, <i>, 
<o>, <u>. We will distinguish between two different phonemic 
values for these graphemes: their checked values and their free 
values. The terms checked and free derive, historically, from 
facts about syllable structure. Take the word bite: in Middle
English, this word was bisyllabic: /biːtə/, in which the first
syllable is /biː/, and the second syllable is /tə/. These are both
open syllables: they contain no coda consonants. What we are
referring to as free values derive, historically, from open
syllables. The word bit, in Middle English, was pronounced
/bit/, a monosyllabic word with a closed syllable, containing the
coda consonant /t/. What we are referring to as checked values
derive from closed syllables: a closed syllable is a checked
syllable. Notice that, in contemporary English, the word bite
has lost the final schwa, and is now monosyllabic: /baɪt/. But it
retains a free value: /aɪ/, the historical descendant of the long
/iː/ vowel. The word bit retains a checked value: /ɪ/, the
historical descendant of the short /i/ vowel.

In the kinds of monosyllabic words we will consider here, 
four of these graphemes have a checked and a free value, and 
<u> has two checked and two free values, as follows (examples
from RP): 

(1)

        Free value       Example   Checked value   Example
<a>     /eɪ/             made      /æ/             mad
<e>     /iː/             Pete      /ɛ/             pet
<i>     /aɪ/             hide      /ɪ/             hid
<o>     /əʊ/             note      /ɒ/             not
<u>     /juː/ or /uː/    cute      /ʌ/ or /ʊ/      cut


While the final <e> remains in the spelling of the words in 
the first column of examples, the schwa which it once 
corresponded to was elided over time, as we have noted. We 
will refer to this grapheme as mute e. In stressed monosyllabic 
words, if there is a mute e, the preceding vowel grapheme 
corresponds to the free value. Note that the names of these
graphemes have the free value: we call them /eɪ/, /iː/, /aɪ/, /əʊ/,
/juː/.

In a stressed monosyllabic word without a mute e, the
vowel grapheme corresponds to the checked value.
This is also true when there is more than one consonant grapheme
at the end of the written word, as in:

(2)

apt, act 
bell, text 
Bill, width 
accost, knots 
butt, tuft 
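
The mute e regularity described above can be expressed as a small procedure. The following Python sketch is an illustration only: the function name vowel_value and the two lookup tables are hypothetical, the values are the RP values from table (1) (only the first value is listed for <u>), and the sketch assumes a stressed monosyllabic word spelled with a single vowel monograph. It does not handle the ‘pre-r’ values or irregular spellings.

```python
VOWELS = "aeiou"

# RP values taken from table (1); for <u> only the first listed value is used.
FREE = {"a": "eɪ", "e": "iː", "i": "aɪ", "o": "əʊ", "u": "juː"}
CHECKED = {"a": "æ", "e": "ɛ", "i": "ɪ", "o": "ɒ", "u": "ʌ"}


def vowel_value(word):
    """Return (grapheme, 'free' or 'checked', RP value) for a stressed
    monosyllabic word spelled with one vowel monograph (assumption)."""
    w = word.lower()
    # Mute e: vowel letter + one consonant letter + final <e>, as in 'made'.
    has_mute_e = (
        len(w) >= 3
        and w[-1] == "e"
        and w[-2] not in VOWELS
        and w[-3] in VOWELS
    )
    stem = w[:-1] if has_mute_e else w
    vowels = [ch for ch in stem if ch in VOWELS]
    if len(vowels) != 1:
        raise ValueError(f"not a single-monograph monosyllable: {word!r}")
    grapheme = vowels[0]
    if has_mute_e:
        return grapheme, "free", FREE[grapheme]
    return grapheme, "checked", CHECKED[grapheme]
```

Under these assumptions, vowel_value("made") gives the free value and vowel_value("mad") the checked value; words ending in more than one consonant grapheme, such as text or width, also come out as checked.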

These free/checked values figure in alternations in 
pronunciation when certain affixes are added to bisyllabic 
words, as in the following examples, from RP: 


(3)

<a>   sane      /seɪn/        sanity        /'sænɪti/
<e>   obscene   /əb'siːn/     obscenity     /əb'sɛnɪti/
<i>   divine    /dɪ'vaɪn/     divinity      /dɪ'vɪnɪti/
<o>   verbose   /vɜː'bəʊs/    verbosity     /vɜː'bɒsɪti/
<u>   consume   /kən'sjuːm/   consumption   /kən'sʌmpʃən/


In stressed monosyllabic words, there are additional values 
for these vowel graphemes in non-rhotic accents such as RP, 
where the historical loss of [ɹ] in coda position has resulted in
changes to the preceding vowel: 

(4) ‘Pre-r’ checked and free values in RP

        Free value            Example   Checked value   Example
<a>     /ɛə/ or /ɛː/          mare      /ɑː/            bard
<e>     /ɪə/                  mere      /ɜː/            perk
<i>     /aɪə/ or [aː]         tire      /ɜː/            bird
<o>     /ɔː/                  bore      /ɔː/            stork
<u>     /(j)ʊə/ or /(j)ɔː/    sure      /ɜː/            curt


In contemporary RP, the /(j)ɔː/ pronunciation of words like
sure seems to be slowly replacing the previous /(j)ʊə/
pronunciation, so that shore and sure are homophones. Many
words like mare are now pronounced /ɛː/, rather than /ɛə/.
Monophthongal versions of /aɪə/ have a long [aː] in RP: there is
variability here among RP speakers.

As we have seen, GA is rhotic, so the free and checked 
values in ‘pre-r’ position differ somewhat from the RP values 
(we include the /ɹ/ for clarity):

(5) ‘Pre-r’ checked and free values in GA

        Free value   Example   Checked value   Example
<a>     /eɪɹ/        mare      /ɑɹ/            bard
<e>     /iɹ/         mere      /ɜɹ/            perk
<i>     /aɪɹ/        hire      /ɜɹ/            bird
<o>     /ɔɹ/         bore      /ɔɹ/            stork
<u>     /ʊɹ/         sure      /ɜɹ/            curt

The same pattern of checked and free values can be found
in words with more than one syllable with the final
syllable stressed (examples from RP):

(6)

        Free value       Example      Checked value   Example
<a>     /eɪ/             ,lemo'nade   /æ/             for'bad
<e>     /iː/             re'plete     /ɛ/             re'pent
<i>     /aɪ/             de'ride      /ɪ/             for'bid
<o>     /əʊ/             de'note      /ɒ/             for'got
<u>     /juː/ or /uː/    de'nude      /ʌ/ or /ʊ/      un'cut

The same is true for ‘pre-r’ vowel graphemes in words with
more than one syllable, stressed on the final syllable:

(7)

        Free value            Example       Checked value   Example
<a>     /ɛə/ or /ɛː/          de'clare      /ɑː/            re'tard
<e>     /ɪə/                  ,inter'fere   /ɜː/            a'ssert
<i>     /aɪə/                 re'tire       /ɜː/            un'gird
<o>     /ɔː/                  dep'lore      /ɔː/            re'tort
<u>     /(j)ʊə/ or /(j)ɔː/    de'mure       /ɜː/            curt


We have seen several examples of words with a stressed
final syllable and a single consonant grapheme with no mute e,
such as mad, pet, hid, not, cut, forgot, forbid, uncut: the
vowel grapheme in the stressed syllable corresponds to the
checked value of the grapheme. If we add inflectional suffixes
to such words, and thus add a syllable, we must double the
consonant grapheme to convey the checked value: madden,
petting, hidden, knotting, cutting, forgotten, forbidden. This is
one sense in which the spelling of English words can actually
be helpful as a guide to pronunciation: without the doubled
consonant grapheme, the stressed vowel grapheme would have
the free value, as in the following pairs:


(8)

        Free value       Example   Checked value   Example
<a>     /eɪ/             'tamer    /æ/             'tanner
<e>     /iː/             'meted    /ɛ/             'petting
<i>     /aɪ/             'diner    /ɪ/             'dinner
<o>     /əʊ/             'noter    /ɒ/             'hotter
<u>     /juː/ or /uː/    'cuter    /ʌ/ or /ʊ/      'cutter
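
The doubling convention behind pairs such as diner and dinner can be sketched as a spelling procedure. This Python fragment is a rough illustration, not a complete account of English spelling: the function name add_suffix is hypothetical, the caller is assumed to supply a stem whose final syllable is stressed, and the letters <w>, <x> and <y> are excluded from doubling as a simplifying assumption.

```python
VOWELS = "aeiou"
NEVER_DOUBLED = "wxy"  # assumption: these letters are not doubled


def add_suffix(stem, suffix):
    """Attach a suffix to a stem with final-syllable stress (assumed).

    A single final consonant letter after a single vowel letter is doubled
    before a vowel-initial suffix, which keeps the checked value in the
    spelling; a final mute e is dropped, which keeps the free value."""
    s = stem.lower()
    vowel_initial = suffix[0] in VOWELS
    ends_single_vowel_consonant = (
        len(s) >= 2
        and s[-1] not in VOWELS + NEVER_DOUBLED
        and s[-2] in VOWELS
        and (len(s) < 3 or s[-3] not in VOWELS)
    )
    if vowel_initial and ends_single_vowel_consonant:
        return s + s[-1] + suffix  # din + er -> dinner
    if vowel_initial and s.endswith("e"):
        return s[:-1] + suffix     # dine + er -> diner
    return s + suffix
```

For instance, add_suffix("din", "er") yields dinner with the checked value, while add_suffix("dine", "er") yields diner with the free value. Irregular spellings of the kind discussed below fall outside this sketch.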


The words above are morphologically complex: they contain 
more than one morpheme. But even in words which are
morphologically simple and have a prefinal stressed syllable,
we find the same pattern. The word hammer, for instance,
does not contain a suffix, but the double consonant grapheme
encodes the checked value for <a>, namely /æ/. The word
laser is also morphologically simple, but the single consonant
grapheme encodes the free (long/tense) value for <a>, namely
/eɪ/ (in RP and GA). Unfortunately, there are irregular
spellings. Take the word panel, which has the checked /æ/
value: we would expect it to be pronounced with /eɪ/, or to be
written pannel, just like flannel and channel. These irregular 
spellings do not help the foreign learner, who often learns the 
spelling of a word before learning the pronunciation. But the 
regularities we have just described nonetheless cover a huge 
number of English words, and are therefore worth learning 
about. 

A further regularity concerns stressed vowels in 
antepenultimate syllables (or earlier in the word than that) 
when they are followed by a single consonant grapheme: these 
typically have the checked value, as in the following words (RP 
pronunciations): 

(9)

<a>   character   /'kæɹəktə/    family     /'fæmɪli/
<e>   enemy       /'ɛnəmi/      federal    /'fɛdəɹəl/
<i>   cinema      /'sɪnəmə/     pitiful    /'pɪtɪfʊl/
<o>   moribund    /'mɒɹɪbʌnd/   positive   /'pɒzɪtɪv/


The <u> grapheme behaves differently: a stressed vowel in 
penultimate position (or earlier), when followed by a single 
consonant grapheme, has the free value for <u>, as in: 

(10)

frugal       /'fɹuːgəl/
imprudent    /ɪm'pɹuːdənt/
accumulate   /ə'kjuːmjəleɪt/

In addition to the <a>, <e>, <i>, <o>, <u> monographs, the 
<y> grapheme can represent vowels, as follows: 

(11)

Grapheme   Phonemes/Allophones   Examples
<y>        /ɪ/, /aɪ/, [i]        myth, rhyme, happy

In stressed syllables, <y> has the same values as <i>. The
checked value /ɪ/ occurs where we would expect it to: words
like myth, consisting of a single stressed syllable, do not have a
mute e, whereas words such as rhyme have a mute e, and thus
have the free value /aɪ/. We have seen that, in ‘pre-r’ position,
when followed by a mute e, RP <i> has /aɪə/, as in tire, which
can be uttered as long [aː]. The same is true for <y> in this
position: tyre corresponds to /taɪə/, which can also be
pronounced [taː].

For word-final unstressed <y>, we have treated the short [i] 
vowel here as a positional variant (allophone) of the /i:/ 
phoneme in RP and GA, where it occurs in words such as 
happy, and in the many adverbs ending in <ly>, such as lovely. 
Stressed word-final <y> corresponds to /aɪ/, as in dry and fly.

11.2.2 Vowel Digraphs

The vowel digraphs present problems for foreign learners, 
since they often correspond to more than one phonemic value, 
for reasons connected to the history of English. This is 
especially true for RP, since it is non-rhotic, which complicates 
matters. Let us list some of the most frequent vowel digraphs 
and discuss their phonemic values. 

11.2.2.1 <ai> 

In SSE, <ai> corresponds to the phoneme /e/, as in pain
(/pen/) and fair (/feɹ/). But, in RP, since it is non-rhotic, there
is a separate ‘pre-r’ value: pain is /peɪn/, while fair is /fɛə/ or
/fɛː/. In this respect, SSE is less complex than RP for many
learners of English. In GA, there is a complication: pain is
/peɪn/, as in RP, but ‘pre-r’ words such as fair, while having
the /eɪ/ phoneme, undergo the ‘Marry Merry Mary’
neutralization, in which the /æ/ vs /ɛ/ vs /eɪ/ oppositions are
neutralized to [ɛ] in ‘pre-r’ position: Marry Merry Mary is
pronounced as ['mɛɹi] ['mɛɹi] ['mɛɹi], and fair is pronounced
[fɛɹ]. This can cause confusion for speakers of varieties of
British English when listening to Americans. For instance,
when my son, while attending primary school in the United
States for a semester, told me the name of a new American
schoolfriend, I thought that the boy was called Ferl, since my
son pronounced the name the way a GA speaker would: [fɛɹl].
In fact, the boy’s name was Farrell: schwa is often elided after
/ɹ/ in GA, and the ‘Marry Merry Mary’ rule applies to the
stressed vowel. As the playwright George Bernard Shaw once
said, Britain and America are two countries divided by a
common language!

11.2.2.2 <au> 

In RP, this can correspond either to /ɔː/, as in fraud, or to the
LOT vowel /ɒ/, as in Austria. Many speakers of GA lack the
/ɒ/ phoneme, and have /ɑ/ instead, so that Austria has the value
/ɑ/ in its stressed syllable (though there is variability here).

11.2.2.3 <ee> 

In SSE, this corresponds to the /i/ phoneme, as in see (/si/) and
peer (/piɹ/). In RP, words like see have the /iː/ phoneme, while,
in ‘pre-r’ position, there is a centring diphthong, as in peer
(/pɪə/). In GA, <ee> corresponds to /iː/, but there is often
neutralization of the opposition between /iː/ and /ɪ/ in ‘pre-r’
position, so that stir it and steer it can be homophonous. When
this opposition is neutralized, the resulting vowel can sound like
[i] or [ɪ]: [stiɹɪt] or [stɪɹɪt].

11.2.2.4 <oo> 

For historical reasons, this digraph has three main phonemic
values in RP and GA: /uː/, as in soon, /ʊ/, as in good, and /ʌ/,
as in blood. In ‘pre-r’ position in RP, the centring diphthong
/ʊə/ occurs, as in poor, but this is increasingly giving way to the
long monophthong /ɔː/, so that sure and shore are
homophones. In GA, the FOOT vowel can be unrounded, and
the symbol often used to represent that resulting high, back
unrounded vowel is [ɯ], as in sure: [ʃɯɹ].

11.2.2.5 <ou> 

This digraph can correspond to either /aʊ/, as in how, or to
/əʊ/ (GA /oʊ/), as in know. This poses problems for learners of
English, since there is no rule governing the occurrence of one
or the other. Worse still, there are forms such as bow, in which
the verb to bow has /aʊ/, while the noun bow has /əʊ/.

In <ought> sequences, RP has /ɔː/, as in thought, while
many GA speakers have /ɒ/ in such words.

Word-final <ough> is complex: it can correspond to /uː/, as in
through, /əʊ/ (GA /oʊ/), as in though, /ʌf/, as in tough, /ɒf/, as
in trough, and /aʊ/, as in plough.

Finally, <ou> in words with <ouble> and <ouple>
corresponds to /ʌ/, as in double and couple.

11.2.2.6 <ea> 

This is one of the most difficult of the vowel digraphs for
students learning English as a foreign language. It can
correspond to /iː/ in RP (and GA), as in the word sea, or /ɛ/, as
in head. There are several ‘pre-r’ values in RP: /ɪə/, as in fear,
/ɛə/ (or, more often in contemporary RP, /ɛː/), as in bear, /ɜː/,
as in dearth, and /ɑː/, as in hearth.

In GA, the ‘pre-r’ values include /ɜ/, as in dearth (/dɜɹθ/),
and /ɑ/, as in hearth. GA also has phonetic [ɛ], as in bear,
because of the ‘Marry Merry Mary’ neutralization rule.


11.3 Consonant Graphemes 
and Their Phonemic Values 


11.3.1 Consonant Monographs 

There are consonant monographs in English which have a 
single phonemic value. These are: 

(12)

Grapheme   Phoneme   Example
<p>        /p/       pit
<k>        /k/       kit
<b>        /b/       bit
<d>        /d/       din
<j>        /dʒ/      joy
<f>        /f/       fun
<v>        /v/       van
<z>        /z/       zip
<l>        /l/       lip
<m>        /m/       mind
<w>        /w/       wet

There are also monographs with more than one phonemic 
(or phonetic) value: 

(13)

Grapheme   Phonemes/Allophones       Examples
<y>        /j/                       year
<n>        /n/, [m], [ɱ], [ŋ]        nip, input, inform, ink
<s>        /s/, /z/, /ʃ/, /ʒ/        sip, rose, compulsion, confusion
<g>        /g/, /dʒ/                 got, gin
<x>        /ks/, /gz/, /kʃ/, /gʒ/    sex, exact, luxury, luxurious

For the grapheme <y>, we are treating words such as year 
as beginning with a consonant: the palatal approximant /j/. The 
sounds [j] and [w] are often referred to as semi-consonants, 
since they are vowel-like in their articulation, but they occupy a 
consonantal position in syllables, namely the onset position. 

As far as the grapheme <n> is concerned, we have seen that 
the phoneme /n/ undergoes nasal assimilation: if followed by a 
bilabial consonant, the realization is bilabial, as in input; if 
followed by a labio-dental consonant, its realization is
labio-dental, as in inform; and if followed by a velar consonant, its
realization is velar, as in ink. These bilabial, labio-dental and 
velar realizations are all allophones of the /n/ phoneme. 
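The nasal assimilation pattern just described can be sketched as a small lookup. This is only an illustrative sketch: the names PLACE, N_ALLOPHONE and realize_n are hypothetical, and the table of following consonants is a deliberately small subset rather than a full inventory.

```python
# Place of articulation for a few following consonants (illustrative subset).
PLACE = {
    "p": "bilabial", "b": "bilabial", "m": "bilabial",
    "f": "labio-dental", "v": "labio-dental",
    "k": "velar", "g": "velar",
}

# Allophone of /n/ selected by the place of the following consonant.
N_ALLOPHONE = {"bilabial": "m", "labio-dental": "ɱ", "velar": "ŋ"}


def realize_n(following):
    """Return the realization of /n/ before the given segment.

    Any segment not listed in PLACE (a vowel, for example) leaves
    the alveolar realization [n] unchanged.
    """
    place = PLACE.get(following)
    return N_ALLOPHONE.get(place, "n")


# The segment that follows <n> in each example word.
for word, nxt in [("nip", "ɪ"), ("input", "p"), ("inform", "f"), ("ink", "k")]:
    print(word, "[" + realize_n(nxt) + "]")
```

The point of the sketch is that a single phoneme is stored, and the allophone is computed from the context, which is what it means for the bilabial, labio-dental and velar nasals to be realizations of /n/ here.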

The <s> grapheme corresponds to four phonemes for
historical reasons: the palato-alveolar pronunciations /ʃ/ and /ʒ/
came about because the following consonant was a palatal
approximant (known informally as a yod): the sequence [s] +
[j] led to an assimilation process known as coalescence, in
which a sequence of an alveolar and a palatal results in a
palato-alveolar articulation. The same process can be seen in
connected speech: a word-final /s/, when followed by a yod,
can lead to assimilation. This can be heard in a song by the
Rolling Stones, entitled ‘Miss You’, pronounced [ˈmɪʃə] by
Mick Jagger. Similarly, a word-final /z/ followed by a
word-initial /j/ can yield a [ʒ], as in He’s yours, pronounced
[hi ˈʒɔːz]. Words such as rose, the past tense of rise, initially had a
voiceless [s], often with a word-final schwa: [roːsə]. It is
common for voiceless segments to become voiced
intervocalically (between vowels): since vowels are typically
voiced, an assimilation process takes place which results in the
intervocalic sound undergoing voicing. The result was [roːzə], which
later lost the word-final schwa. Much later, the /oː/
diphthongized in RP, resulting in the present-day [ɹəʊz]
pronunciation.
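Coalescence can be pictured as a rewrite over a sequence of segments. The following is a minimal sketch under stated assumptions: the function name coalesce and the COALESCENCE table are hypothetical, segments are given as plain lists of IPA symbols, and only the two alveolar fricative + yod cases discussed here are covered.

```python
# Alveolar fricative + yod merges into a single palato-alveolar fricative.
COALESCENCE = {("s", "j"): "ʃ", ("z", "j"): "ʒ"}


def coalesce(segments):
    """Return a new list in which each [s]+[j] or [z]+[j] pair is merged."""
    out = []
    i = 0
    while i < len(segments):
        pair = tuple(segments[i:i + 2])
        if pair in COALESCENCE:
            out.append(COALESCENCE[pair])
            i += 2  # both input segments are consumed by the merge
        else:
            out.append(segments[i])
            i += 1
    return out


# 'Miss you', treated as one string of segments across the word boundary.
print(coalesce(["m", "ɪ", "s", "j", "u"]))
```

Treating the two words as one string of segments mirrors the observation that the process applies in connected speech as well as word-internally.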

The <g> grapheme corresponds to the /g/ phoneme, but also
corresponds to the /dʒ/ phoneme, again for historical reasons
relating to assimilation: since the vowel in words such as gin,
gibberish and gist is a high front vowel, the historical /g/
underwent palatalization, and became palato-alveolar. In words
such as gibe and Giles, the vowel was the high front /iː/ in
Middle English, and thus induced the same palatalization
process. Unfortunately, not all /g/ + high front vowel
sequences underwent this process, so that words such as give
and gig, while containing a high front vowel, have retained the
velar pronunciation /g/. Some words, such as the proper name
Gill, can be pronounced either way, with the female name Gill
being pronounced with a [dʒ], and the male name Gill being
pronounced with a [g]. Similarly, the word gill, when meaning
a part of a fish, is pronounced with a velar stop, whereas the
unit of measurement known as a gill is pronounced with the
palato-alveolar value.

The <x> grapheme corresponds to sequences of two
phonemes. If we take the /ks/ sequence to be the ‘basic’ value,
we can see that the same historical processes of voicing and
palatalization have taken place to yield the other three values.
The /gz/ sequences in words like exact have arisen from
intervocalic voicing assimilation. In words like luxury, the
sequence /ksj/ gave rise to coalescence of the /sj/ sequence,
resulting in /kʃ/. In words like luxurious, both coalescence and
intervocalic voicing have resulted in the /gʒ/ sequence. Stress
also plays a role here: if the preceding vowel is stressed, as in
sex and luxury, we have a voiceless value; if the following
vowel is stressed, as in exact and luxurious, we have a voiced
value.
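The four values of <x> can be read as the product of two yes/no questions: is the following vowel stressed (voicing), and did a yod historically follow (coalescence)? A minimal sketch, assuming a hypothetical function x_value and assuming that the caller already knows the answer to both questions for the word at hand:

```python
def x_value(stress_follows, yod_follows):
    """Return the two-phoneme value of <x>.

    stress_follows: True if the vowel after <x> is stressed (voiced value).
    yod_follows:    True if a yod historically followed (coalescence applies).
    """
    if yod_follows:
        return "gʒ" if stress_follows else "kʃ"
    return "gz" if stress_follows else "ks"


examples = {
    "sex": (False, False),
    "exact": (True, False),
    "luxury": (False, True),
    "luxurious": (True, True),
}
for word, (stress, yod) in examples.items():
    print(word, "/" + x_value(stress, yod) + "/")
```

The sketch deliberately ignores exceptions; it only restates the regularity that stress placement selects the voicing and a historical yod selects the palato-alveolar fricative.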

11.3.2 Consonant Digraphs and 
Trigraphs 

There are consonant digraphs and trigraphs with a single 
phonemic value: 

(14)

Grapheme    Phoneme    Example

<ck>        /k/        clock
<ph>        /f/        photo
<sh>        /ʃ/        ship
<rh>        /ɹ/        rhubarb
<dg>        /dʒ/       edge
<dj>        /dʒ/       adjunct
<ng>        /ŋ/        sing
<tch>       /tʃ/       itch

In addition to the consonant monographs with more than one 
value, there are consonant digraphs and trigraphs with more 
than one value. These are: 

(15)

Grapheme    Phonemes           Examples

<gu>        /g/, /gw/          guard, anguish
<qu>        /k/, /kw/          unique, queen
<ch>        /tʃ/, /k/, /ʃ/     chair, chaos, chic
<gh>        /g/, /f/, zero     ghoul, tough, thigh
<th>        /θ/, /ð/           think, this
<sch>       /sk/, /ʃ/          school, schmaltzy
<sc>        /s/, /ʃ/, /sk/     scene, conscious, scour
<gg>        /g/, /dʒ/          egg, exaggerate
<cc>        /k/, /ks/          account, accent
<ss>        /s/, /ʃ/           kiss, mission

The <gu> digraph mostly corresponds to the /g/ phoneme, as 
in guide and vague, but can correspond to the /gw/ sequence. 
This can be found only in the middle of words, where there is 
no morpheme boundary, as in anguish, language and penguin, 
but not in, for instance, vaguely. 

The <qu> digraph in words like queen was introduced by
French scribes after the Norman Conquest: words like this had
previously been written with a <cw> sequence. Words such as 
unique have been borrowed from French with their French 
spellings. 

The <ch> digraph replaced the <c> monograph in words like
church, which had previously been pronounced with a /k/, and
is now pronounced with a /tʃ/. The value /k/ can be found in
many words of Greek origin, such as chaos, chemist and
psychology. The /ʃ/ value is found in fairly recent loanwords
from French, such as chic, champagne and brochure. We have
not listed the value /x/ (the voiceless velar fricative), since it
does not occur in most varieties of English. It does, however,
occur in Scottish English, in words such as loch (/lɒx/), as
opposed to lock (/lɒk/). While some non-Scottish speakers
pronounce words such as loch with a [x], most produce a [k]
when attempting Scottish words such as the place-name
Auchtermuchty, German words such as Bach or Spanish words
such as rioja.

The <gh> digraph used to represent the voiceless velar
fricative /x/, which can be heard in Spanish words like ajo
(‘garlic’) or German words like Buch (‘book’). The voiceless
velar fricative no longer exists in RP or GA: it was elided in
many words, like thigh, daughter and thought, which is why
we have represented its value as zero. The voiceless velar
fricative can still be heard among older speakers of Scots, who
pronounce daughter as [dɔxtɪr] and thought as [θɔxt].

Most words spelled with <th> correspond to /θ/, since <th>
words which contain /ð/ are words of a non-lexical category
(function words), such as the, this, that, these, those and then.
While these are high-frequency words, there are very few of
them, which is typical of function words.

Words spelled with the <sch> trigraph, such as schmaltzy, 
schmuck and schmooze are often of Yiddish origin. Yiddish is a 
Germanic language (with vocabulary from Hebrew, Aramaic 
and Slavic languages) spoken by Ashkenazi Jews. The Yiddish 
words which have found their way into English mostly come 
from New York’s Jewish community. Some of these words, 
such as schlep (‘to carry around a heavy object’ or ‘to travel 
somewhere slowly’), are not known throughout the United 
States, and used not to be used in British English, but schlep is 
beginning to appear in British newspapers. Other <sch> words 
are loanwords from German, such as schnapps (a German 
drink) and schnitzel (meat fried in breadcrumbs). Words such 
as school were borrowed from Latin, in which the <sch>
sequence corresponded to /sk/, and have retained that sequence
in English for over a thousand years.

Many words spelled with <sc> were borrowed from French. 
They initially had /sk/ sequences in Latin, but the /k/ had been 
elided in the French pronunciation by the time they were 
borrowed. 

The sequences <gg> and <cc> correspond to the same
values as <g> and <c>. Doubled graphemes in English do not
correspond to long consonant phonemes (often called
geminates), unlike in Italian, where spellings such as mamma
and pizza correspond to the pronunciations /ˈmamːa/ and
/ˈpitːsa/, where the symbol [ː] indicates length. Some sequences of
identical consonant graphemes correspond to sequences of two
identical consonant phonemes across a morpheme boundary,
as in unnerve, which has the form /ʌn + ˈnɜːv/, realized as
[ˌʌnˈnɜːv]. A sequence of two identical phonemes in English is
not the same thing as a single, long (geminate) phoneme: there
is no such phoneme as /nː/ in English. The fact that English
speakers do not pronounce Italian loanwords such as pizza
with a geminate consonant reflects the fact that English has no
geminate consonant phonemes.

The <ss> sequence has /s/ as its ‘basic’ value, as in kiss and
massive, but the kinds of palatalization process we discussed
above have yielded /ʃ/ in words which have had a yod
historically, like mission and assure.

11.3.3 Unpronounced Consonant 
Graphemes 

Some word-initial consonants have been elided as English has 
evolved. These include /k/ and /g/ in branching onsets with /n/, 
so that the following digraphs no longer correspond to 
sequences of onset phonemes: 

(16)

Grapheme    Phoneme    Example

<kn>        /n/        know
<gn>        /n/        gnome

Other words were borrowed from Greek, whose phonotactic 
constraints allow onset sequences such as /ps/ and /pt/. Since 
English phonotactics do not allow such sequences in onsets, 
the following correspondences have arisen: 

(17)

Grapheme    Phoneme    Example

<ps>        /s/        psyche
<pt>        /t/        Ptolemy

Other graphemic sequences which reflect historical processes
of elision include the following:

(18)

Grapheme    Phoneme     Example

<wr>        /ɹ/         write
<wh>        /w/, /h/    whine, whole

Old English had an onset sequence /hw/ in words like whole.
In RP and GA, the /h/ has since elided, while the spelling still
indicates a cluster of two phonemes. In SSE, and in some
varieties of American English, the /hw/ sequence has become
[ʍ], a voiceless fricative which has both a bilabial and a velar
articulation, rather like the sound one makes when blowing out
a candle. Pairs such as which and witch are minimal pairs in
SSE: /ʍɪtʃ/ vs /wɪtʃ/, whereas they are homophones in RP and
GA.

There are also word-final digraphs which correspond to a 
single phoneme, since the word-final consonant has elided 
during the evolution of the language: 

(19)

Grapheme    Phoneme    Example

<mn>        /m/        hymn
<mb>        /m/        comb
<gn>        /n/        sign
<gm>        /m/        paradigm

In morphologically related words, the root-final consonant is 
no longer word-final, and has therefore often been retained, as 
in the following: 

(20)

Spelling        Pronunciation

hymnal          [ˈhɪmnəl]
signatory       [ˈsɪɡnətɹi]
paradigmatic    [ˌpæɹədɪɡˈmætɪk]

In word-medial position, <t> is often not pronounced in the 
following sequences: <st> followed by a vowel grapheme, as in 
christen, and <stl>, as in castle. 

11.3.4 Graphophonemics and 
Contrastive Phonemics 

Some learners of English as a foreign language encounter 
problems which relate both to the English graphophonemic 
correspondences and to the difference between the phoneme 
system of their native language and that of English, especially 
RP English. An example concerns mid vowels. It is common to 
find languages with a core monophthong system like the
following: 

(21)

/i/         /u/
/e/         /o/
/ɛ/         /ɔ/
      /ɐ/

By core monophthong system is meant a system ignoring,
for instance, nasalized monophthongs such as the French
phonemes /ɛ̃/, /ɑ̃/ and /ɔ̃/, or front rounded vowels, such as the
French phonemes /y/, /ø/ and /œ/, all of which are said by
linguists to be ‘marked’, i.e. relatively uncommon in the
world’s languages.

In the above diagram, there are two high vowels (/i/ and /u/),
two high-mid vowels (/e/ and /o/), two low-mid vowels (/ɛ/ and
/ɔ/), plus a single low vowel which is neither fully front nor
fully back, which is why we have chosen the /ɐ/
representation. Examples of languages with such a system are
the Romance languages French, Spanish and Italian. If one
speaks a language with this kind of core system, the vowel
phoneme system of RP is quite a challenge. To begin with, it
has two high back rounded vowels (/uː/ and /ʊ/), rather than
the single one which is the norm in the world’s languages. It also has
two ‘a’-type vowels (/æ/ and /ɑː/), or even three, if we
consider that the vowel /ʌ/ is rather like an ‘a’ sound, and thus
often pronounced with an [ɐ] in languages which lack an
/æ/-/ʌ/-/ɑː/ contrast. The back mid vowels of RP are especially
difficult, since RP has three ‘o’-type phonemes: /əʊ/ (which
used to be monophthongal /oː/ in Middle English), long/tense
/ɔː/ and short/lax /ɒ/. For speakers whose native language has
only two ‘o’-type phonemes, a phonological difficulty arises:
how to expand the two perceptual categories into three.

The phoneme system of our native language, which we have 
in our heads, is a system of perceptual categories, via which 
we decode speech. It is difficult to perceive a three-way 
contrast in a foreign language if it corresponds to a two-way 
contrast in our native language. For instance, English has a
two-way contrast between alveolar and palato-alveolar
fricatives: /s/-/ʃ/ and /z/-/ʒ/. If an English speaker tries to learn
Polish, a difficulty arises with the fricatives, since Polish has a
three-way contrast between alveolar, palato-alveolar and
pre-palatal fricatives, transcribed as /s/, /ʃ/ and /ɕ/. Because we
English speakers are so accustomed to perceiving only two
categories of voiceless fricative in this region of the oral cavity,
we have difficulty with both perceiving and producing the
three-way Polish contrast.

For speakers of many languages who are learning English,
the same difficulty applies to the RP contrast between /əʊ/, /ɔː/
and /ɒ/. Standard Scottish English would be easier for such
speakers, since its core monophthong system is very similar to
the one depicted above: only one ‘u’-type sound, a two-way
/e/-/ɛ/ distinction, a two-way /o/-/ɔ/ distinction and only one ‘a’
phoneme (though SSE does have the /ʌ/ phoneme). The
phonological difficulty with ‘o’ sounds is compounded by the
vagaries of English spelling since, as we have seen, there are
various ways of spelling the three different ‘o’ sounds. We
have seen that the <o> grapheme can correspond to /əʊ/,
/ɔː/ or /ɒ/. However, there is some light at the end of the tunnel
for non-native speakers learning RP: we have cited
graphophonemic regularities above which determine, to a large
extent, which values we find. In addition, the sequences
<aught>, <ought> and <aw> systematically correspond to the
RP long vowel /ɔː/, as in caught, sought and lawn, and not to
RP /əʊ/. This is a point worth bearing in mind for learners of
English, since there are RP minimal pairs such as loan and
lawn, which have, respectively, /əʊ/ and /ɔː/. An anecdote
might serve to highlight the problem: when we first bought a
house in France, a French colleague asked me a question about
it. I thought the question was ‘Do you have a loan?’, to which
I answered ‘Yes, of course: we’re not wealthy enough to have
paid cash for the house.’ What he meant was ‘Do you have a
lawn?’, but he produced /əʊ/ for the <aw> sequence in lawn,
rather than the /ɔː/ pronunciation.

To conclude on English spelling-to-pronunciation 
correspondences: the regularities we have given above show 
that these are not completely chaotic. There are, though, many 
exceptions to the ‘rules’ we have provided, and those make 
English graphophonemics rather more messy than it might 
otherwise have been. We have seen, for instance, that words 
such as panel really ought to be written pannel, as in channel, 
with a double consonant grapheme, and my own name, Philip, 
ought to be written Phillip, to indicate the checked value ([ɪ])
of the pre-final stressed vowel. However, it is best, perhaps, to 
emphasize the regularities, rather than make a very long list of 
exceptions: one cannot, after all, have irregular forms unless 
there are regular forms. 


Exercises 

Listen to sound files online

1 Listen to Track 11.1 at www.wiley.com/go/carrphonetics, made by
an RP speaker. For each of the words on the recording (listed
below), say whether the vowel corresponds to the free or
the checked value of the vowel grapheme. Explain why.
Provide the symbol, in slanted brackets, for each vowel
phoneme.

(a) dam

(b) dame 

(c) den 

(d) dene 

(e) Tim 

(f) time 

(g) dot 

(h) dote 

(i) dun 

(j) dune 



2 Listen to Track 11.2, made by an RP speaker. For the
stressed syllable in each of the words on the recording, 
say whether the vowel grapheme corresponds to the free 
or checked value and explain how the spelling encodes 
those values. 

(a) taper 

(b) tapper 

(c) Peter 

(d) petter 

(e) pining 

(f) pinning 

(g) doting 

(h) dotting 

(i) astuter 

(j) a stutter 

3 Listen to Track 11.3, made by an RP speaker. For the
stressed syllable in each of the following words, say 
whether the vowel grapheme corresponds to the free or 
checked value and explain how this relates to the way the 
word is spelled. 

(a) sanity 

(b) episode 

(c) citadel 

(d) opera 

4 Listen to Track 11.4, made by an RP speaker. Which
value does the <gh> digraph correspond to in each of the 
following words? 

(a) rough 

(b) through 

(c) ghastly


12


Variation in English Accents 


12.1 Introduction 

In this chapter, we will consider some general aspects of accent 
variation. In chapter 13, a brief overview is given of several 
accents of English: London English, Tyneside English, 
Standard Scottish English (SSE), New York City English, 
Texan English, Australian English and Indian English, followed 
by an outline of the sorts of phenomena which give rise to 
divergence of accents over time. 

Three of the accents we have referred to in this book (GA, 
RP and SSE) are viewed socially as ‘standard’ accents. The 
notion ‘standard’ is a social one: no linguist would claim that 
there is any coherent notion of inherent phonetic or 
phonological superiority, since such a notion simply does not 
make any phonetic or phonological sense. There can be no 
doubt that many people judge some accents to be superior to 
others, or take some accents to be standard accents and others 
to be non-standard accents. But those judgements are founded 
on non-linguistic factors, to do with social attitudes in the 
societies in question. From a strictly linguistic point of view, 
such judgements are, quite clearly, entirely arbitrary. For 
example, RP, the standard accent in England, is non-rhotic, and 
the non-rhoticity of RP is therefore judged by some (perhaps
many) English people to be more prestigious than the rhotic
accents found in many of the Western parts of England. But 
the standard accents SSE and GA are rhotic, and in the United 
States, it is the rhotic accents which are often judged to be 
more prestigious than the non-rhotic American accents (the 
judgement cannot arise in Scotland, where all native accents 
are rhotic). 

Clearly, it is social attitudes which determine such 
judgements about accents, rather than the phonetic and 
phonological properties of the accents themselves. It is 
common to find social judgements to the effect that some 
accents are ‘uglier’ or ‘harsher’ than others. These judgements 
too are entirely arbitrary as far as phonetics and phonology are 
concerned. For instance, if the word say, pronounced as [saɪ]
in London English, is judged ‘ugly’ by an RP speaker, then the
RP speaker’s own pronunciation of the word sigh, as [saɪ],
ought also to strike the RP speaker as ‘ugly’. Cases such as
this show that it simply cannot be any phonetic properties of
the sounds [saɪ] in and of themselves which induce the
aesthetic judgement. Rather, such judgements also derive from
and reflect social attitudes about what might, rather broadly, be
called ‘ways of life’. In Britain, most non-standard accents
which are judged ‘ugly’ or ‘uncivilized’ are spoken in industrial
or post-industrial urban areas; examples often cited are the
working-class accents of London, Birmingham, Liverpool,
Belfast, Glasgow and Tyneside. Similar sorts of judgement are 
made in the United States, with respect to, for example, the 
broad New York City accent often referred to as ‘the Brooklyn 
accent’ (though it is not confined to the Brooklyn district of 
New York City). 

Non-standard rural accents are, by contrast, often judged
‘quaint’, rather than ‘ugly’; examples are accents from the
Highlands of Scotland, from the West Country in England, or
from the US Southern states. It is highly likely that these
aesthetic judgements arise from the conscious or unconscious
association of accents with the real or imagined ways of life of
those who speak them. We will therefore stand back from such
social attitudes and attempt to examine accent variation from
the point of view of the phonetician and the phonologist.

12.2 Systemic vs Realizational 
Differences between Accents 

Let us begin by considering one of the differences between
many accents in the North of England and many of those
spoken in the South of England. In the latter accents, there is a
phonological contrast between /ʊ/ and /ʌ/, which can be
observed in pairs such as book/buck, rook/ruck, put/putt and
many others. That distinction is missing in many Northern
English accents, which have /ʊ/ in each member of the pair.
That is, many Northern English accents simply lack the /ʌ/
phoneme, and thus the /ʊ/ vs /ʌ/ distinction: for them, pairs
such as put and putt are homophones, not minimal pairs. We
will refer to this sort of difference as a systemic difference
between two accents: the sets of phonological contrasts
(specifically, in this case, the vowel contrasts) of the speakers
differ.

Systemic differences are widely attested. In many Scottish
English and Scots accents, for instance, there is no equivalent
of the /æ/ vs /ɑː/ distinction, of the sort found in RP minimal
pairs such as ant/aunt, palm/Pam, etc. In Scottish English,
each member of such pairs contains the same vowel phoneme,
/ɐ/, which is realized as [ɐ]. Again, pairs which count as
minimal pairs in non-Scottish accents are homophonous in
Scottish English. Similarly, as we have seen, while RP has a
three-way distinction between [ɒ], [ɑː] and [ɔː], many speakers
of GA have only a two-way distinction between [ɔː] and /ɑ/.
This systemic difference means that there is a difference in the
sets of words which are distinguished by means of these
phonemes, as follows:

(1)

Words of the type    RP      GA

palm                 /ɑː/    /ɑ/
caught               /ɔː/    /ɔː/ or /ɑ/
cot                  /ɒ/     /ɑ/
coffee               /ɒ/     /ɔː/

Systemic differences are not restricted to vowel systems. An
example of a systemic difference in consonant systems is the
contrast, found in many Scottish accents, between /ʍ/ and /w/,
which is found in minimal pairs such as whales vs Wales, whin
vs win, what vs watt, etc. That contrast is absent in most
non-Scottish accents, which have a /w/ phoneme, but not a /ʍ/
phoneme. In those accents, the above pairs are homophones,
rather than minimal pairs.

There are differences between accents which do not amount
to a difference in the systems of phonological contrasts.
Consider our discussion of dark and non-dark /l/ in chapter 7:
we said there (7.6) that there is an allophonic rule in RP, to the
effect that /l/ is realized as a velarized (‘dark’) lateral when it
occurs in rhymes, but is realized as a non-velarized
(‘non-dark’) lateral when it occurs in onsets. In many accents of the
South of Scotland, in Australian English, in GA and in some
accents of the North of England, /l/ is realized as a dark lateral
in all positions, so that /lʌl/ (lull), for instance, is realized as
[ɫʌɫ], rather than [lʌɫ]. In Tyneside English, on the other hand,
/l/ is consistently realized as a clear lateral in all positions, so
that /lʌl/ (lull) is realized as [lʊl]. There is no question of
postulating, in these sorts of case, a difference in the
underlying system of contrasts: all three accents have a
contrast between /l/ and other consonant phonemes, such as
/r/, but there is variation in the way that /l/ is realized. While
there are languages in which the distinction between clear and
dark laterals is contrastive, there can be no question of
postulating such a phonological contrast in any of these accents
of English: in the Southern Scottish and Australian cases, there
are only dark laterals; in the Tyneside case, there are only clear
laterals; and in RP and GA, while there are dark and non-dark
laterals, the distinction is purely allophonic: the distinction lies
at the level of realizations of a single phoneme. We will
therefore refer to these sorts of difference as realizational
differences.

Realizational differences involving vowels are very common.
For instance, one of the differences between SSE and most
non-Scottish accents is the presence of allophonic vowel length
in SSE. This is absent in most non-Scottish accents, but this
difference is a purely realizational matter. Take the contrast
between /i/ and /u/ in SSE (as in beet, boot): a parallel contrast
is also present in most non-Scottish accents, including RP. In
RP, the equivalent phonemes, /iː/ and /uː/, are, in all contexts,
typically realized as [iː] and [uː] respectively, as in geese and
goose. In SSE, however, /i/ is realized as either short [i] (as in
feet) or long [iː] (as in freeze), and /u/ is realized either as short
[u] (as in foot) or long [uː] (as in lose), depending on the
phonological context: the long realization occurs word-finally
and before the voiced segments [ɹ], [z], [v], [ð] and [ʒ]; the
short realization occurs elsewhere.
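The SSE length alternation just described is a context-dependent rule, and it can be sketched in a few lines. This is an illustrative sketch only: the names LENGTHENING and sse_realization are hypothetical, the segment set is the one listed in the text, and a word-final vowel is represented by passing None as the following segment.

```python
# Voiced segments before which SSE /i/ and /u/ are realized as long.
LENGTHENING = {"ɹ", "z", "v", "ð", "ʒ"}


def sse_realization(vowel, following):
    """Return the SSE realization of /i/ or /u/.

    following is the next segment, or None when the vowel is word-final.
    """
    if following is None or following in LENGTHENING:
        return vowel + "ː"
    return vowel


for word, vowel, nxt in [
    ("feet", "i", "t"),
    ("freeze", "i", "z"),
    ("foot", "u", "t"),
    ("lose", "u", "z"),
    ("free", "i", None),
]:
    print(word, "[" + sse_realization(vowel, nxt) + "]")
```

Because length is fully predictable from the context, only one phoneme per vowel needs to be stored, which is why the difference from RP counts as realizational rather than systemic.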

Realizational differences can become systemic differences
over time. For instance, at a stage in the history of RP when it
was still a rhotic accent, the /iː/ phoneme was realized as [iːə]
before /ɹ/ in coda position, in much the same way as /iː/ is
currently realized as [iːə] before /l/ in coda position. At that
stage in the history of RP, pairs such as feared/feed and
beard/bead differed in two respects: one had [ɹ], while the
other did not, and one had [iː], while the other had [iːə]. With
the gradual loss of the [ɹ] articulation in coda position, pairs
like this became minimal pairs: [biːəd] (beard) vs [biːd] (bead).
It is reasonable to say that, at that stage, a new /iːə/ phoneme
(the ancestor of present-day /ɪə/) emerged. This process is
known as a phonemic split: a distinction which was once
allophonic becomes phonemic. In terms of differences between
accents, we want to say that, at a point when both RP and,
say, SSE were rhotic, they both had the /iː/ phoneme, but there
was a realizational difference, in that RP, but not SSE, had an
[iːə] allophone of the /iː/ phoneme. Now, however, there is a
systemic difference: RP has an /iː/ vs /ɪə/ contrast which SSE
lacks.

Another example of a phonemic split which occurred in the 
history of RP and many other varieties of English is the 
FOOT/STRUT split, mentioned above. The STRUT phoneme 
evolved historically from the FOOT phoneme: at one stage in 
the history of English, all realizations of the FOOT vowel were 
rounded. But unrounded realizations began to appear, and 
these eventually took on phonemic status, resulting in the 
emergence of the STRUT vowel, pronounced [ʌ] in RP and
many other varieties of English. Many accents in the North of
England failed to undergo the FOOT/STRUT split, so that
words like strut are pronounced with an [ʊ]. The difference
between these North of England accents and RP is therefore a
systemic difference: RP possesses a phonemic distinction
which is absent in those Northern varieties. As a result of this,
pairs of words which are minimal pairs in RP, such as put
([pʰʊt]) and putt ([pʰʌt]), are homophones in the Northern
varieties: both put and putt are pronounced [pʰʊt].

How can one tell, for a given difference, whether it is 
systemic or realizational? Let us examine a particular case, that 
of London English. By ‘London English’ (henceforth, LE), we 
do not mean the accent spoken by all natives of London; 
rather, we refer, albeit in a necessarily oversimplified way, to 
the speech of working-class natives of London. It has been
widely noted that speakers of LE often (but variably) utter
words such as lay, pay, say with an [aɪ] diphthong, and we
know that, in RP, these have an [eɪ] diphthong. Is this a
systemic or a realizational difference?

The answer is that we cannot tell on the basis of this 
evidence alone. In order to establish whether there is a 
systemic difference between the accents, we must consider the 
system of contrasts in each accent. Specifically, we must ask: 
is there a phonetic [eɪ]/[aɪ] distinction in either accent, and if
so, is it contrastive? We have already established that, in RP,
there is such a distinction, and that it is contrastive (cf. the
minimal pairs bay vs buy, Tay vs tie, say vs sigh, etc.). What
we must then establish is whether these pairs are also minimal
pairs in LE: if they were to turn out to be homophones in LE,
then we could, all things being equal, reasonably conclude that
there is a systemic difference here, in just the same way that
we concluded, on the basis of evidence from minimal pairs and
homophones, that RP has an /ʊ/ vs /ʌ/ contrast, which many
Northern English accents (with /ʊ/ alone) lack.

What we find is that these are indeed minimal pairs in LE:
while bay, Tay, say, etc. have an [aɪ] diphthong in LE, buy, tie,
sigh, etc. have an [ɑɪ] diphthong. In the absence of any further
evidence to the contrary, we may conclude that there is no
systemic difference here: the contrast which we found in RP is
maintained in LE.

But the matter does not end there, since we also know that
RP has a contrast between /aɪ/ and /ɔɪ/, as in buy, tie, sigh vs
boy, toy, soy. What we must now ask is whether this contrast
is also sustained in LE, or whether these RP minimal pairs are
homophones in LE. The answer is that the contrast is sustained
in LE: while buy, tie, sigh, etc. have [ɑɪ], boy, toy, soy, etc.
have [oɪ].
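The reasoning used here, checking whether pairs that are minimal pairs in one accent are homophones in another, can be sketched as a small test over word lists. The sketch below is illustrative only: the function name merged_pairs and the three toy lexicons are hypothetical, and the transcriptions are simplified assumptions standing in for real data (in particular, [ɑɪ] and [oɪ] for the LE realizations).

```python
def merged_pairs(pairs, lexicon):
    """Return the pairs whose two members are pronounced identically."""
    return [(a, b) for a, b in pairs if lexicon[a] == lexicon[b]]


# Toy lexicons: spelling -> simplified transcription (assumed values).
RP = {"bay": "beɪ", "buy": "baɪ", "boy": "bɔɪ", "put": "pʊt", "putt": "pʌt"}
LE = {"bay": "baɪ", "buy": "bɑɪ", "boy": "boɪ"}
NORTHERN = {"put": "pʊt", "putt": "pʊt"}

diphthong_pairs = [("bay", "buy"), ("buy", "boy")]
foot_strut_pairs = [("put", "putt")]

# RP keeps every pair distinct, so these are all minimal pairs in RP.
print(merged_pairs(diphthong_pairs + foot_strut_pairs, RP))

# LE pronounces each word differently from RP but merges nothing:
# a realizational difference.
print(merged_pairs(diphthong_pairs, LE))

# The Northern lexicon merges put and putt: a systemic difference.
print(merged_pairs(foot_strut_pairs, NORTHERN))
```

An empty result means the contrast is maintained, however different the realizations; a non-empty result points to a contrast that the accent lacks.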

We have now noted a related set of non-systemic (non- 
contrastive), purely realizational differences between RP and 
LE, which we might conceive of in terms of the articulatory 
and perceptual ‘space’ in which the realizations of a phoneme 
are located. We may depict this in terms of the vowel space 
diagram. Consider the phonetic realizations of the RP vowel 
phonemes /eɪ/, /aɪ/ and /ɔɪ/. Let us imagine that speakers of RP
typically have diphthongal realizations of those phonemes 
whose starting points fall within the following sorts of 
articulatory ‘zones’ in the vowel space: 

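(2) [Figure not reproduced: vowel space diagram showing the zones
in which the starting points of RP /eɪ/, /aɪ/ and /ɔɪ/ fall]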

The speaker might well vary in her or his realizations of 
these phonemes. However, so long as the articulation of a 
particular vowel does not encroach upon the space of any of 
the others, bay will still be distinguishable from buy, and buy 
from boy. If, in the course of the historical development of the 
accent, the articulations of /eɪ/ and /aɪ/ were to become so
close as to be perceptually indistinguishable, such that pairs like
bay vs buy were both uttered as [baɪ], then the contrast would
be lost. That phenomenon, which is widely attested in the
histories of human languages, is referred to as a phonemic
merger: where once a phonemic contrast was present, it came
to be collapsed.⁶

Clearly, no merger has occurred in the LE cases cited above:
while the realization of /eɪ/ has indeed ‘shifted’ to [aɪ], the
realization of /aɪ/ has, in turn, shifted to [ɑɪ], and the contrast
is thus maintained. Similarly, while the realization of /aɪ/ has
shifted to [ɑɪ], the realization of /ɔɪ/ has, in turn, shifted to [oɪ],
and again the contrast has thus been maintained.⁷ This kind of
phenomenon, which is fairly widespread, is often referred to as 
a vowel shift. A parallel shift is evident in Australian English,
where the /i:/ phoneme has diphthongized to [ɪi], thus
encroaching, to some extent, on the space of /eɪ/. The /eɪ/
phoneme has in turn shifted to [ɐɪ], with a fairly low, central,
unrounded starting point, thus encroaching on the perceptual
space of /aɪ/. In turn, /aɪ/ has shifted to [ɑɪ], thus encroaching
on the space of /ɔɪ/. However, the /ɔɪ/ phoneme appears not to
have taken ‘evasive action’, so [ɑɪ] and [ɔɪ] are contrastive.
The point to be borne in mind is that vowel shifts are purely 
realizational: they do not involve a change in the number of 
phonological contrasts. 
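
The difference between a vowel shift and a merger can be stated as a simple check on a mapping from phonemes to realizations. The sketch below is illustrative only: the names `count_contrasts` and `classify_change` are hypothetical, `MERGED` is an invented accent in which bay and buy would both be [baɪ], and the LE values assume the realizations [aɪ], [ɑɪ] and [oɪ].

```python
def count_contrasts(realizations):
    """Number of distinct realizations, i.e. phonemes still kept apart."""
    return len(set(realizations.values()))


def classify_change(before, after):
    """Compare two phoneme -> realization mappings over the same phonemes.

    "merger"    : at least two phonemes that were distinct now sound alike
    "shift"     : realizations have changed, but every contrast survives
    "no change" : the two mappings are identical
    """
    if count_contrasts(after) < count_contrasts(before):
        return "merger"
    if after != before:
        return "shift"
    return "no change"


RP = {"eɪ": "eɪ", "aɪ": "aɪ", "ɔɪ": "ɔɪ"}
LE = {"eɪ": "aɪ", "aɪ": "ɑɪ", "ɔɪ": "oɪ"}
# Hypothetical accent in which /eɪ/ has moved to [aɪ] but /aɪ/ has stayed put.
MERGED = {"eɪ": "aɪ", "aɪ": "aɪ", "ɔɪ": "ɔɪ"}

if __name__ == "__main__":
    print("RP -> LE:", classify_change(RP, LE))
    print("RP -> hypothetical accent:", classify_change(RP, MERGED))
```

The point of the check is that a shift leaves the number of contrasts unchanged, whereas a merger reduces it.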

We asked whether the realization of /aɪ/ as [ɑɪ] resulted in
the destruction of a contrast which is present in RP. We might
equally ask whether the realization of the /ɔɪ/ phoneme as [oɪ]
in LE results in the destruction of a contrast found in RP; the
answer is that, although RP speakers may well utter both [ɔɪ]
and [oɪ], the phonetic difference between the two, while
perceptible, is never contrastive: one cannot cite minimal pairs
involving the two. So there is no phonemic contrast which
would be collapsed by uttering /ɔɪ/ as [oɪ] (the RP speaker is
free to do so, without risk of conflating a contrast). For the LE
speaker, while the realization of /eɪ/ has encroached upon the
‘space’ of /aɪ/, and the realization of the latter has encroached
upon the space of /ɔɪ/, the realization of the latter has not
encroached upon the space of any other phoneme.

12.3 Perceptual and Articulatory Space

The simplest system of vowel phonemes⁸ found in the world’s
languages is the three-vowel /i/, /u/, /a/ system, typically
depicted within the vowel space as:

(3)   /i/         /u/

            /a/

The realizations of the vowel phonemes in a system like this 
can vary considerably; it will matter little in terms of the 
hearer’s identification of a given vowel phoneme if /i/ is 
realized as an [e]-type vowel, so long as it is relatively front, 
unrounded and relatively high. Nor will it matter if /u/ is
realized as an [o]-type vowel, so long as it is relatively high,
relatively back and rounded. Similarly, it will not matter
whether realizations of /a/ are front (like [a]), back (like [ɑ]),
or central (like [ɐ]), so long as they are relatively low and
unrounded. That is, each vowel phoneme has quite a large 
perceptual and articulatory space. 

A slightly larger vowel system, frequently encountered in the
world’s languages, has mid vowels:

(4)   /i/         /u/

        /e/     /o/

            /a/

In a system like this, the vowel space is a little more
crowded: it will matter whether a realization of /i/ is [e]-like, or
if a realization of /u/ is [o]-like. But it will not matter if a
realization of /e/ is [ɛ]-like, so long as it is not [a]-like. Nor will
it matter if a realization of /o/ is [ɔ]-like.

In a slightly larger system, there are contrasts between
high-mid and low-mid vowels:

(5)   /i/         /u/

        /e/     /o/

        /ɛ/     /ɔ/

            /a/

In this kind of system, there is even less articulatory and
perceptual space for each vowel phoneme: it will matter if the
realizations of /e/ are [ɛ]-like, for instance.
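
The idea that a larger system leaves each phoneme less ‘room’ can be made concrete with a toy model in which a hearer assigns a vowel token to the nearest phoneme in the system. The sketch below rests on several assumptions that are not part of the discussion above: the coordinates are invented, only height and backness are modelled (rounding is ignored), and the names `VOWEL_SPACE` and `perceive` are hypothetical.

```python
import math

# Hypothetical coordinates: (height, backness), each scaled from 0 to 1.
# height 1.0 = high, 0.0 = low; backness 0.0 = front, 1.0 = back.
VOWEL_SPACE = {
    "i": (1.00, 0.0), "u": (1.00, 1.0),
    "e": (0.67, 0.0), "o": (0.67, 1.0),
    "ɛ": (0.33, 0.0), "ɔ": (0.33, 1.0),
    "a": (0.00, 0.5),
}

THREE_VOWELS = ["i", "u", "a"]
FIVE_VOWELS = ["i", "u", "e", "o", "a"]
SEVEN_VOWELS = ["i", "u", "e", "o", "ɛ", "ɔ", "a"]


def perceive(token, system):
    """Return the phoneme of `system` whose prototype lies nearest to `token`."""
    point = VOWEL_SPACE[token]
    return min(system, key=lambda phoneme: math.dist(point, VOWEL_SPACE[phoneme]))


if __name__ == "__main__":
    for name, system in [("three-vowel", THREE_VOWELS),
                         ("five-vowel", FIVE_VOWELS),
                         ("seven-vowel", SEVEN_VOWELS)]:
        print(name, "system: an [e]-like token is heard as",
              "/" + perceive("e", system) + "/")
```

In this model an [e]-like token counts as /i/ in the three-vowel system but as /e/ in the five-vowel system, and an [ɛ]-like token counts as /e/ in the five-vowel system but as /ɛ/ in the seven-vowel one.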

Many English accents have relatively large vowel phoneme 
systems, often containing, for instance, as many as three or 
four ‘a’-type phonemes, as in RP’s front /æ/, central /ʌ/ and
back /ɑ:/. This means that shifts in realization can easily result
in one phoneme encroaching upon the space of another. And
this notion of articulatory and perceptual ‘space’ has a bearing
on the question of which pairs of vowel phonemes we should
consider when attempting to establish whether a given vowel
difference is systemic or not. One way of answering this
question is to say that one should consider vowel phonemes
which are in some sense ‘adjacent’ in articulatory and/or
perceptual terms. This is what we did when we considered the
[eɪ] vs [aɪ] distinction: they are adjacent in that the starting
points of both diphthongs are front, non-high and unrounded.
We might equally have considered the [ɛ] vs [eɪ] distinction,
since [e], the starting point for the latter, is close to [ɛ] in
articulatory terms. Indeed, one finds that the /ɛ/ phoneme is
often realized as an [eɪ] diphthong in LE (thus, well can be
pronounced as [weɪw]). This, however, does not cause merger
of the /ɛ/ vs /eɪ/ contrast since, as we have seen, /eɪ/ is realized
as [aɪ].

The main difficulty one encounters in considering adjacent 
vowel sounds is that a given vowel can be adjacent to many 
(perhaps most) of the other vowel sounds found in a particular 
accent. As we have just seen, [eɪ] is adjacent to both [aɪ] and
[ɛ]. It is also adjacent to [ɜ:], since the starting point for [eɪ] is
front, mid and unrounded, and [ɜ:] is central, mid and
unrounded. As it happens, [ɜ:] has not shifted in LE, but it has
in other accents. In the Liverpool accent, for instance, it has
fronted to [ɛ:], so that work and bird, which are pronounced
[wɜ:k] and [bɜ:d] in RP, are pronounced [wɛ:x] and [bɛ:d] in
Liverpool. This means that the realization of /ɜ:/ has
encroached upon the articulatory and perceptual space of the
adjacent vowel phoneme /ɛ/. However, the phonemic
distinction between /ɜ:/ and /ɛ/ is maintained in Liverpool,
since the former is long with respect to the latter, as in bird vs
bed ([bɛ:d] vs [bɛd]), and thus distinguishable from it. For
many Tyneside speakers, /ɜ:/ has been retracted and rounded
in some words, thus encroaching on the space of /ɔ:/. Thus
work is pronounced [wɔ:k] by many speakers. This has
resulted in a loss of the /ɜ:/ vs /ɔ:/ contrast in a particular set of
words, so that pairs such as walk/work and bird/bored are
homophones rather than minimal pairs.

It appears that vowel articulations are especially susceptible 
to this ‘shifting around’ phenomenon and it is because of this 
that the majority of differences between accents are 
differences in the articulation of vowels, rather than of 
consonants. This is perhaps because of the nature of vowel
articulations, which all have a stricture of open approximation.
Furthermore, the more open the vowel, the more open the 
approximation, and the less contact there is between the tongue 
and the other parts of the oral cavity. It seems clear that, for 
any given phonetic segment, we are unlikely to hit on exactly 
the same articulation each and every time we attempt it. Not 
every [d] we utter will have exactly the same part of the tongue 
closing against exactly the same part of the alveolar ridge for 
exactly the same length of time with exactly the same amount 
of vocal cord vibration beginning and ending at exactly the 
same time. So variation is inherent in speech. 

But it appears to be even more inherent in vowel 
articulations than it is in, say, stops, since it is much harder to 
feel where one’s tongue is in one’s mouth when one produces 
a vowel than when one produces, say, a stop. One can feel the 
lips close together for the articulation of a bilabial stop, and one 
can feel the tongue against the alveolar ridge when one 
produces an alveolar stop, so that the articulatory difference 
between the two is easily discerned by the speaker. But when
one produces, say, an [e:] as opposed to an [ɛ], it is much
harder to feel what one’s tongue is doing. One can therefore
easily ‘overshoot’ or ‘undershoot’ and end up with articulations
which encroach upon the space of another, adjacent, vowel
phoneme. Vowel articulations, in short, form a continuum;
there are no abrupt, discrete divisions between them.

Parallel to this articulatory continuum, there is a perceptual 
one. Recall that our ‘cardinal vowels’ are merely reference 
points, based on an arbitrary carving up of the available vowel 
space and the available parameters of tongue height,
frontness/backness and roundedness, all of which are a matter of
degree, rather than a matter of absolutes. Once an [e]-type
sound begins to lower, at what point does it become an
[ɛ]-type sound? Once an [æ]-type sound begins to raise, at what
point does it become an [ɛ]-type sound? The answer is that it
is impossible to say with any certainty. Little wonder, then, that
when some American English speakers say bad, speakers of
many accents of British English think that they are saying bed:
their [æ]-type sound has raised to what is perceived as an
[ɛ]-type sound. We perceive vowels the way we perceive colours.
If we are presented with a classic example of
green, we have no difficulty in identifying it as green. The
same is true for a classic example of blue. But once we are
presented with a colour that is half-way between the two, we
often cannot tell whether we think it is greenish-blue or
bluish-green. In such in-between cases, one person will judge, say, an
item of clothing as green, while another will judge it to be blue.
So it is with vowels: we find it hard to categorize a vowel
half-way between [e] and [ɛ]. If a vowel that was once close to
prototypical [e] begins to be produced closer to [ɛ], hearers
may categorize it as an instance of an [ɛ], and thus may, in
turn, begin to articulate it as an [ɛ].

The most important point is that our perception of acoustic 
events is heavily dependent on the mentally represented system 
of phonological contrasts which we have in our native accent. 
It is also vital that, for a phonemic distinction to exist, it must 
be based on a phonetic distinction which is perceptible to 
human beings. Those phonetic differences can be minute, but 
they must be perceivable. If a language does not have a large 
vowel phoneme inventory, the realizations of each vowel 
phoneme are much more ‘free to roam’ in the available 
articulatory and perceptual space. 

There are consonantal continua too, but there is a little more
in the way of discrete divisions among consonants than among
vowels. A bilabial articulation, for example, is radically distinct
from an alveolar one, since the tongue is not implicated at all in
the former, and the lips need not be involved in the latter. But
even then, particularly among consonantal articulations which
are more vowel-like, a gradual transition from alveolar to labial
is indeed possible, as we have seen with [w] realizations of /l/.
Once a secondary velar
articulation is present in the pronunciation of an alveolar lateral, 
it can become the primary articulation, the original alveolar 
stricture can be lost altogether, and a new bilabial stricture can 
emerge. 

12.4 Differences in the Lexical Distribution of Phonemes

There is a further kind of variation between accents in which 
the phonological representation of words is concerned, but 
which is not systemic variation. For instance, in many accents
of the North of England, there is some kind of /æ/ vs /ɑ:/
distinction, parallel to that found in RP. The actual vowel
quality of the ‘long a’ often differs from that of RP. Many
West Yorkshire speakers have a much more front articulation
for ‘long a’ than RP speakers; very often the difference is
between a short [a] and a long [a:], with the vowel qualities
being the same, and the two vowels differentiated only with
respect to length. Nonetheless, the phonemic distinction is
there, and functions as the basis of minimal pairs such as
ant/aunt and Sam/psalm. We can say that West Yorkshire’s /a/
vs /a:/ distinction is parallel to, or equivalent to, RP’s /æ/ vs /ɑ:/
distinction.

But speakers of such Northern English accents (and indeed 
of GA) often utter a short vowel in words which would be 
uttered with a long vowel by the RP speaker, words of the 
lexical set BATH. Examples, for many Northern speakers, are 
bath, class and glass, all ending with a voiceless fricative. 
Other members of the set have a nasal-plus-consonant 
sequence, such as France and dance. What we want to say 
about this sort of difference is not that the Northern English
speaker lacks the ‘long a’ vs ‘short a’ phonemic distinction (in
the way that Scottish speakers do) but that the phonological
form of those particular words contains the short, rather than 
the long, phoneme. We can illustrate this as follows: 

(6) Systemic difference vs lexical distribution difference

                      Phonemic               Phonological forms
                      ‘long/short a’?        of ant, aunt, bath

RP speaker:           Yes                    /ænt/, /ɑ:nt/, /bɑ:θ/

Northern speaker:     Yes                    /ant/, /a:nt/, /baθ/

Scottish speaker:     No                     /ant/, /ant/, /baθ/

Such differences are very susceptible to variation between
different accents in the range of words which have one 
phoneme rather than the other in their phonological 
representations. They are therefore less general in nature, and 
more idiosyncratic, than systemic and realizational differences; 
however, such differences can cause major problems for 
mutual intelligibility between speakers of different dialects. 
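
The three-way comparison in (6) can be expressed as a small decision procedure. This is a sketch under assumptions of our own: the labels "short" and "long" stand for the ‘short a’ and ‘long a’ phonemes, and the names `ACCENTS` and `compare` are hypothetical rather than part of the analysis.

```python
# Hypothetical encoding of (6). Each accent lists the 'a'-type phonemes it
# has, and which of them appears in the phonological form of each word.
ACCENTS = {
    "RP": {
        "phonemes": {"short", "long"},
        "words": {"ant": "short", "aunt": "long", "bath": "long"},
    },
    "Northern": {
        "phonemes": {"short", "long"},
        "words": {"ant": "short", "aunt": "long", "bath": "short"},
    },
    "Scottish": {
        "phonemes": {"short"},
        "words": {"ant": "short", "aunt": "short", "bath": "short"},
    },
}


def compare(word, accent_a, accent_b):
    """Classify the difference between two accents for a given word."""
    a, b = ACCENTS[accent_a], ACCENTS[accent_b]
    if a["phonemes"] != b["phonemes"]:
        # One accent lacks a contrast which the other has.
        return "systemic"
    if a["words"][word] != b["words"][word]:
        # Same system of contrasts, but a different phoneme in this word.
        return "lexical-distributional"
    # Same phoneme in both accents; any audible difference is realizational.
    return "none or realizational"


if __name__ == "__main__":
    print("bath, RP vs Northern:", compare("bath", "RP", "Northern"))
    print("bath, RP vs Scottish:", compare("bath", "RP", "Scottish"))
    print("ant,  RP vs Northern:", compare("ant", "RP", "Northern"))
```

The order of the tests matters: the system of contrasts is examined first, and the forms of individual words are compared only when the two systems match.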

We have now begun to note some differences between 
accents and dialects, in a way which allows us more insight 
into the nature of those differences than merely noting different 
pronunciations, and which shows the extent to which 
theoretical considerations play a part in our analyses. While it is 
informative to note that many speakers in the North of England 
pronounce words like bus as [bʊs] and that LE speakers
pronounce words like say as [saɪ], or to note that some
speakers utter an [ɹ] in some contexts rather than others, what
we have done here is to say a little more than just that: we 
have sought to gain some insight into the nature of the 
phonological knowledge possessed by speakers of different 
accents. 


Notes 

1 This is not to deny that social attitudes to accents affect 
accent variation itself. Clearly, if we are investigating the way 
people speak, and if the way people speak is influenced by 
social attitudes to accents, then we are obliged to recognize 
those factors as part of the general picture. 

2 Many, but not all. Some Northern accents have a
distinction between /ʊ/ and /ʌ/, in which the realization of /ʌ/
is an unrounded version of [ʊ], which we might reasonably
transcribe as [ɤ].

3 An alternative term would be ‘contrastive difference’. 

4 Many Tyneside speakers lack the /ʊ/ vs /ʌ/ distinction.

5 The term ‘working-class’ is vague, and enormously difficult 
to define. We will, nonetheless, assume that it is meaningful 
and does serve to identify what are real, if complex, 
differences in social class (whatever problems may reside in 
the vague notion ‘social class’ itself). The term ‘London’ is 
equally vague; nonetheless, it also serves a useful function, 
since it too allows us to identify a genuine, if hard-to-define, 
geographical, and perhaps cultural, entity. 

6 Clearly, if there is variation between one realization and the 
other, and this is not noted by linguists, then a contrast that 
might have been wrongly taken to have been merged could 
re-emerge. Additionally, a contrast may merge in some 
phonetic environments, but not in others, leaving some 
minimal pairs intact while collapsing others. See 12.4 below 
on lexical incidence. 

7 We are assuming here that LE and RP have a common 
source, and that LE has innovated historically in its 
realizations of these phonemes in particular in a way which 
RP has not. This assumption happens to be justified in this 
case, but it must not be assumed that all cases are like this. 
There are cases in which it is RP which has innovated. Such 
is the case with the /ʌ/ vs /ʊ/ contrast, which is a Southern
innovation: many Northern accents have not undergone that 
innovation. 

8 By ‘vowel system’ here, we mean ‘system of 
monophthongs’, ignoring diphthongs. Generally speaking, 
monophthongs are ‘more basic’ than diphthongs, but
diphthongs are common, and the most common in the
world’s languages are the [ai]-type and [au]-type diphthongs.


Exercises 


1 Listen to Tracks 12.1, 12.2 and 12.3 at
www.wiley.com/go/carrphonetics. For each sound file,
answer the following questions:

(a) Is the speaker rhotic or non-rhotic? (Cf. chapter
7.) What evidence do you have for your answer?

(b) Does the speaker have a rounded or an
unrounded vowel in words of the sort LOT?

(c) What is the pronunciation of words spelled <wh>
and <w>?

(d) Does the speaker exhibit the phenomenon of
Flapping? (Cf. chapter 9.)

One of these speakers is an RP speaker, another is a GA
speaker and the other is an SSE speaker. Which is which?
Relate your responses to your answers to the questions above.

2 Where there are differences between the speakers with 
respect to the phenomena in (1a-d), say, for each
difference, whether it is a systemic or a realizational 
difference, and explain why.

13 An Outline of Some Accents of English

13.1 Some British Accents

13.1.1 London English (Track 13.4 at www.wiley.com/go/carrphonetics and exercise 1)

13.1.1.1 Defining the London English Accent 

By ‘London English’ we mean the vernacular, working-class 
London accent spoken in the boroughs of the East End of 
London. There are more and less broad varieties of this accent, 
ranging from Cockney at one end of the spectrum to accents 
which are closer to RP in some respects at the other end. 

13.1.1.2 London English Vowels

Let us postulate that the vowel system of LE is exactly parallel
to that of RP, but with many major realizational differences,
mostly of a ‘vowel shift’ nature, as discussed earlier in chapter
12.

Realizations of /ʌ/, /æ/, /ɛ/, /eɪ/, /aɪ/ and /ɔɪ/

In LE, the /ʌ/ phoneme is often realized as a short [a]-type
sound. This then encroaches upon the /æ/ phoneme, which is
often realized closer to [ɛ]. That in turn encroaches on the /ɛ/
phoneme, which in turn is often realized as a diphthong close
to [eɪ], and that realization in turn encroaches on the perceptual
space of /eɪ/. As we have seen, the /eɪ/, /aɪ/ and /ɔɪ/ phonemes
also participate in this vowel shift, which we may depict as
follows:

/ʌ/    /æ/    /ɛ/    /eɪ/   /aɪ/   /ɔɪ/

[a]    [ɛ]    [eɪ]   [aɪ]   [ɑɪ]   [oɪ]

Realizations of /aʊ/, /ʌ/, /æ/ and /ɑ:/

Additionally, in the Cockney version of LE, the /aʊ/
phoneme is often realized as a long [a:], as in sound ([sa:nd])
and pout ([pa:t]). Since this vowel is distinct from the
realizations of both /ʌ/ (as in putt: [pat]) and /æ/ (as in pat:
[pɛt]), the distinction between the three phonemes is
preserved. Note too that, since the realization of /ɑ:/ in LE has
not shifted, there is a clear difference in vowel quality between
it and the [a:] realization of /aʊ/.

Realizations of /ɔ:/ and /əʊ/

/ɔ:/ is typically realized as [ɔə] in open syllables (as in war:
[wɔə]) and [oʊ] elsewhere (as in short: [ʃoʊʔ]). This means
that one of the realizations of this phoneme encroaches upon
the space of the /əʊ/ phoneme, leading to the possibility of
phonemic overlapping. That phoneme in turn does not typically
have an [əʊ] realization; rather, it tends to be realized as [ɒʊ]
before a tautosyllabic /l/, and [aʊ] elsewhere.

Phonemic overlapping: /i:/ vs /ɪ/

The /ɪ/ phoneme is often realized as [ɪi] before a
tautosyllabic /l/, as in fill: [fɪiw].

Similarly, the /i:/ phoneme is often realized as [ɪi] before a
tautosyllabic /l/, as in feel: [fɪiw]. Thus pairs such as these are
homophones in LE, but are minimal pairs in RP. Rather than
concluding that LE has undergone a phonemic merger, and
therefore that this difference is systemic in nature, we will say
that it is a matter of phonemic overlapping between the two
phonemes, since, in other contexts, /i:/ retains its [i:]
realization, and /ɪ/ retains its [ɪ] realization, as in beat and bit:
[bi:ʔ] and [bɪʔ]. Note that, when the /l/ at the end of feel and
fill is syllabified into a following onset, as in feeling and filling,
the /l/ is no longer in the same syllable as the preceding vowel,
which is realized in its normal way, thus: [fi:lɪn] and [fɪlɪn].
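
Phonemic overlapping of this kind amounts to a context-sensitive realization rule: two phonemes share a realization in one environment and remain distinct elsewhere. A minimal sketch, assuming the overlapping realization is [ɪi] and using hypothetical names (`realize_vowel`, `DEFAULT`, `BEFORE_TAUTOSYLLABIC_L`):

```python
# Assumed LE realizations of /i:/ and /ɪ/, following the description above.
DEFAULT = {"i:": "i:", "ɪ": "ɪ"}
BEFORE_TAUTOSYLLABIC_L = {"i:": "ɪi", "ɪ": "ɪi"}


def realize_vowel(phoneme, next_segment, same_syllable):
    """Return the realization of /i:/ or /ɪ/ in a given context.

    `next_segment` is the following phoneme; `same_syllable` says whether
    that segment belongs to the same syllable as the vowel.
    """
    if next_segment == "l" and same_syllable:
        return BEFORE_TAUTOSYLLABIC_L[phoneme]
    return DEFAULT[phoneme]


if __name__ == "__main__":
    # feel vs fill: /l/ is tautosyllabic, so the realizations overlap.
    print(realize_vowel("i:", "l", True), realize_vowel("ɪ", "l", True))
    # feeling vs filling: /l/ is in the following onset, so they differ.
    print(realize_vowel("i:", "l", False), realize_vowel("ɪ", "l", False))
    # beat vs bit: no following /l/, so they differ.
    print(realize_vowel("i:", "t", True), realize_vowel("ɪ", "t", True))
```

Because the two phonemes are kept apart in every other context, the rule describes an overlap in realization and not a loss of the contrast.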

Lowering of word-final schwa 

Words such as cinema and letter have a word-final schwa in
RP. In LE, this vowel is often lowered to a central [ɐ] sound:
letter is often pronounced ['lɛʔɐ].


13.1.1.3 London English Consonants 

LE differs from RP both in terms of the realization of 
consonant phonemes and, arguably, in terms of the consonant 
phoneme system.

Voiceless stop phonemes

The voiceless stop phonemes /p/, /t/, /k/ are often realized,
before a primary stressed vowel, with heavy aspiration, and in
the case of /t/ at least, often with affrication, as in cup of tea:
[kʰaʔpətsʰɪi]. Note too glottal realizations of these stops, as [ʔ],
in a wider range of contexts than in RP. One such context is
intervocalically (between vowels), noticeably when the first
vowel is stressed, as in matter: ['mɛʔɐ]. Since the first vowel is
stressed in such words, the glottaling is foot-internal.

/θ/ vs /f/ and /ð/ vs /v/ in Cockney

It has often been noted that RP minimal pairs such as
thin/fin are homophones for many Cockney speakers, both
being [fɪn]. This is referred to informally as ‘TH-Fronting’ by
Wells (1982; see Suggested Further Reading). Whether /θ/ may
be said to be absent in Cockney depends very much on
whether [θ] is uttered in any contexts by Cockney speakers. If
one found that [f], rather than [θ], invariably turns up in all
other contexts (e.g. between vowels, as in Cathy, and
word-finally, as in moth), then one could reasonably conclude that
Cockney speakers simply lack the contrast, and that there is a
systemic difference here between Cockney and RP. But many
speakers are variable with respect to this phenomenon, so we
cannot conclude that they lack the /θ/ phoneme. As far as the
/v/ vs /ð/ distinction is concerned, it is rather difficult to find
many minimal pairs involving the two (that vs vat and live/lithe
are examples). However, it has been noted that words such as
feather and with, which have, respectively, intervocalic and
word-final /ð/ in RP, are often uttered with [v] in Cockney. It is
not clear, however, that word-initial /ð/, as in the, that, there,
their, this, etc., is uttered with [v] in Cockney; if such words
are uttered with [ð], then we must say that there is a purely
realizational difference here, with Cockney /ð/ realized as [v]
intervocalically and word-finally.

/l/ vocalization

The /l/ phoneme is often realized as a voiced alveolar lateral
approximant, but it may be overlaid by secondary articulations,
such as the velarization we find in the rhyme position of the
syllable in RP, and in both the onset and rhyme positions in
accents such as Scottish English, GA and Australian English.
When the secondary articulation becomes dominant, the
alveolar articulation may be lost, resulting in a vowel-like
articulation. When lip rounding is added to this articulation, the
resulting realization sounds very much like a [w] sound, when
in coda position in the syllable. This can be seen in Cockney
pronunciations such as girl ([gew]), Bethnal ([bɛfnəw]) and
healthy (['ɛwfi]). This is known as ‘l vocalization’ because the
/l/ is realized as a vowel-type sound. Cockney l-vocalization
seems to be spreading to other towns in the UK.

Absence of /h/ in Cockney? 

It has been widely noted that RP minimal pairs such as 
hair/air are homophones in Cockney and perhaps generally in 
LE. Since Cockney appears to lack even word-internal [h], as 
in behold, this looks very much like a systemic difference. 
Further evidence that /h/ is simply absent in Cockney comes 
from two sources. 

Firstly, we know that [æn] (or [ən]) is the phonetic form of
the indefinite article which occurs before vowel-initial words in
English, as in an ear, an oar, etc., whereas [æ] (or [ə]) is the
form which occurs before consonant-initial words, as in a boat,
a house, etc. Cockneys select [æn] (or [ən]) before words such
as house, hair, etc., which might be taken to suggest that such
words are phonologically vowel-initial in Cockney.

Secondly, evidence from the phenomenon of hyper-correction
is rather telling. When speakers hyper-correct, they
‘correct’ words (try to make them approximate to what is
considered the ‘proper’ pronunciation found in a prestige
accent) which do not require correction. For instance, many
speakers who would normally utter [ɪn] rather than [ɪŋ] for the
suffix -ing may be nonetheless aware that [ɪŋ], rather than [ɪn],
is the ‘correct’ pronunciation. Such speakers will often
‘correct’ words such as kicking from [kʰɪkɪn] to [kʰɪkɪŋ], but
may also mistakenly ‘correct’ words such as badminton to
[bædmɪŋtən]. The significance of this phenomenon is that the
speaker has [ɪn] as his or her phonological form for the -ing
morpheme, and overgeneralizes the ‘correction’ of his or her
realizations to cases which do not even contain the -ing suffix.
Similarly, speakers of French, who lack an /h/ phoneme, will
hyper-correct their English, resulting in pronunciations such as
[hɛə] for both hair and air. The problem for the French
speaker is that, in the absence of an /h/ phoneme, she or he is
not to know which of her or his /h/-less mental representations
should have an /h/ added and which not. Cockney speakers
have been observed to behave in just the same way as French
speakers, hyper-correcting air to [hɛə], ear to [hɪə] and so on.
This strongly suggests that Cockney speakers, like French
speakers, simply do not have an /h/ phoneme. However, this
phenomenon is perhaps in decline in London English,
suggesting that awareness of the /h/ phoneme has always been
present among LE speakers.


13.1.2 Tyneside English (Track 13.5 and exercise 2)

13.1.2.1 Defining the Tyneside English Accent 

By ‘the Tyneside English accent’ (otherwise known as ‘the 
Geordie accent’), we mean the accent spoken by the natives of 
the urban areas to the north and south of the last few miles of 
the River Tyne before it meets the North Sea, including, 
principally, Newcastle upon Tyne to the north of the river and 
Gateshead to the south. 


13.1.2.2 Tyneside English Vowels 

/ʊ/ vs /ʌ/

Most Tyneside speakers are typically Northern in having no
/ʊ/ vs /ʌ/ distinction: they have the former phoneme, but not
the latter. Accents which have this contrast have undergone
what is referred to as the FOOT/STRUT split, a historical
change in which the /ʊ/ phoneme developed unrounded
realizations which eventually gained phonemic status, so that
pairs such as putt (/pʌt/) and put (/pʊt/) are minimal pairs. In
North of England accents such as Geordie, these are typically
homophones, both being pronounced [pʰʊt].

/e/ and /o/

The Tyneside equivalents of RP /eɪ/ and /əʊ/ are /e/ and /o/.
The Tyneside /e/ phoneme is realized, in the speech of many
Tyneside speakers, as a long monophthong: [e:]. This
realization varies with a diphthongal realization ending in
schwa: [e:ə]. The Tyneside /o/ phoneme may also be realized
as a long monophthong by many speakers: [o:]. For some
speakers the realization is a long monophthong which is a
fronted [o:], of the [ø:] sort.

/ɔ:/, /ɜ:/ and non-rhoticity

Tyneside is non-rhotic and this has, of course, affected the
development of the vowel system. Some Tyneside speakers
lack a contrast between /ɔ:/ and /ɜ:/ in certain words, so that
pairs such as work and walk are homophones: [wɔ:k]. Speakers
with a ‘broader’ Tyneside accent maintain the distinction
between /ɔ:/ and /ɜ:/ in words like walk and talk, so that /ɔ:/ is
realized as [ɐ:] and /ɜ:/ realized as [ɔ:]: [wɐ:k] (walk) vs [wɔ:k]
(work).

Schwa and non-rhoticity 

Among the centring diphthongs, the Tyneside /ɪə/ phoneme
is typically realized as [ia] or [iɐ]. The same kind of effect
occurs in realizations of /ʊə/, as in poor: [pʰuɐ]. The Tyneside
/ɛə/ phoneme is typically realized as a long monophthong: [ɛ:].

Related to the Tyneside pronunciations of the centring
diphthongs /ɪə/ and /ʊə/, in Tyneside, the schwa phoneme (/ə/)
in word-final position can be rather [a]- or [ɐ]-like (i.e. a low
central unrounded vowel), but this depends on the history of
the word. Where schwa was followed by an /r/ historically, it
tends to be [a] or [ɐ], as in dresser.

Low unrounded vowels

The Tyneside /æ/ phoneme is typically realized as [ɐ], but
often realized as a long [ɐ:] when it is followed by a voiced
word-final consonant, as in lad ([lʲɐ:d]), but not in lass ([lʲɐs]).

/a:/ 

Although Tyneside speakers, like many Northern speakers, 
often have /æ/ rather than /ɑ:/ in some words (e.g. bath), this
reflects neither a systemic nor a realizational difference 
between Tyneside and RP. It is a lexical-distributional 
difference: a matter of which of the two phonemes appears in a 
given word. The words in question are words belonging to 
what Wells (1982) refers to as the lexical set BATH; these 
include words such as bath and class, where the vowel is 
followed by a voiceless fricative, and words such as grant and 
France, where the vowel is followed by a nasal-plus-consonant 
cluster. 

/ai/ realizations 

Although subject to variation (both lexical and 
sociolinguistic), /ai/ is often realized as [ai] or [ei] word-finally 
and before voiced fricatives, but realized with a more central 
starting point, as [ai], elsewhere. The effect is similar to that of 
the Scottish vowel length generalization on /ai/ in Standard 
Scottish English. 


13.1.2.3 Tyneside English Consonants 

/h/ and /l/ in Tyneside 

Although almost every accent of English allows for 
non-realization of /h/ in unstressed words of a functional category 
(he, him, etc.), only some allow for non-realization of /h/ in the 
stressed syllables of words of a lexical category. These accents 
can be found in many parts of England, but not in Tyneside: 
/h/ is almost always realized in stressed syllables in Tyneside. 

Tyneside /l/ is realized as a ‘clear l’ in all positions, 
transcribed [lʲ]. The term ‘clear l’, in this context, denotes an 
alveolar lateral approximant with a secondary palatal 
articulation, in which the front of the tongue forms an 
articulation with the hard palate. 

Glottal stop and glottalization of /p/, /t/ and /k/ 

The voiceless stops /p/, /t/ and /k/ often undergo 
glottalization between vowels, particularly when the first vowel 
has primary or secondary stress, as in clipper, fitter, hacker. 
The resulting realization can be transcribed as [ʔp], [ʔt], [ʔk]. 
The articulations in the oral cavity occur simultaneously with 
the glottal closure. The [ʔt] realization of /t/ varies with [ʔɾ]: a 
glottalized tap. The term ‘glottal reinforcement’ is sometimes 
used to denote this kind of articulation. Sonorants may 
intervene between the vowels and the stop, as in grumpy, 
auntie, hankie. 

In the words cited here, we could define the context for this 
glottalization as foot-internal (i.e. between a primary or 
secondary stressed vowel and an unstressed vowel), so that 
/p/, /t/ and /k/ are aspirated at the beginning of a stressed 
syllable but glottalized if they occur foot-internally. 
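
The two contexts just described can be summarized in a short Python sketch. It is an illustration only: the function name and the position labels are hypothetical, and the sketch ignores the sociolinguistic variation that affects these realizations.

```python
VOICELESS_STOPS = {"p", "t", "k"}

def realize_stop(stop, position):
    """Return a broad transcription of a Tyneside voiceless stop.

    `position` is a hypothetical label: "foot_initial" for the onset of
    a stressed syllable, "foot_internal" for a stop between a stressed
    vowel and a following unstressed vowel.
    """
    if stop not in VOICELESS_STOPS:
        raise ValueError(f"not a voiceless stop: {stop!r}")
    if position == "foot_initial":
        return stop + "ʰ"   # aspirated
    if position == "foot_internal":
        return "ʔ" + stop   # glottalized (glottal reinforcement)
    return stop             # other contexts are left unspecified here

# The medial stops of clipper, fitter and hacker are all foot-internal.
for stop in ("p", "t", "k"):
    print(realize_stop(stop, "foot_internal"))
```

Treating the foot, rather than the word, as the domain keeps the statement to two cases, which is the point of the foot-internal formulation.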

Speakers of Tyneside English also exhibit glottaling, in which 
voiceless stops, especially /t/, are realized as the glottal stop 
[ʔ]. This is distinct from what we have called glottalization, 
since the articulation here is a glottal stop with no additional 
closure in the oral cavity. Glottaling is found in many accents 
of English, including RP, but it is perceptually more salient if it 
occurs intervocalically after a stressed vowel, as in words such 
as butter, which in Tyneside English can be pronounced with 
glottalization ([ˈbʊʔtɐ]) or with glottaling ([ˈbʊʔɐ]). Glottaling 
in this position is socially stigmatized in Britain, but it is 
perfectly common in Geordie, London English, working-class 
Scottish English and many other accents of English in the 
British Isles. RP speakers who claim that glottaling is ‘lazy’, 
‘unclear’ or ‘slovenly’ are, amusingly, unaware that they 
themselves utter glottal stops on a vast scale, but often in 
phonological contexts in which they are less salient 
perceptually, as in the case of word-final glottal stops in 
unstressed function words followed by a word-initial 
consonant, in sentences like I didn’t know that she was here, 
where the final /t/ of that is very likely to be uttered as a glottal 
stop, even by RP speakers. Ironically, it is precisely because 
the intervocalic, foot-internal glottal stop in pronunciations of 
words such as butter is so clear (i.e. clearly audible) that it is 
the object of complaints by those who say it is unclear. 

The ‘r’ realizations of /t/ 

Tyneside, like many North of England accents, has a 
realization of /t/ which is either an [ɹ]-type or an [ɾ]-type 
articulation. The phenomenon is known informally as ‘T-to-R’. 
It is not entirely clear whether there is a stateable phonological 
context in which this occurs, or whether there is simply a stock 
of words, or even phrases, in which it typically occurs. The 
phenomenon is probably sociolinguistically variable. The 
realization is reminiscent of Flapping in GA, in that it seems to 
occur inter-vocalically, but the Tyneside phenomenon is 
lexically much more sporadic. Typical cases seem to involve a 
word-final /t/ preceded by a short vowel and followed by a 
word beginning with a vowel, as in got a light, get off, put it 
down, but he does, shut up. It can, however, occur 
word-internally, as in better, and after long vowels, as in I thought he 
did. The ‘r’ realization also seems to vary with the glottal and 
glottalized realizations, so that a given pronunciation of, say, 
get off can have [t], [ʔ], [ʔt], [ʔɾ], [ɾ] or [ɹ] as the realization of 
the /t/ in get. 

13.1.3 Standard Scottish English 
(Track 13.6 and exercise 3) 

13.1.3.1 Defining the Standard Scottish English Accent 

Standard Scottish English (SSE) is the standard accent which 
many Scots speak when speaking the Standard English dialect. 
It is characteristic of university-educated, middle-class Scottish 
speakers. It is distinct from what is referred to as Scots, which 
is derived from the Northumbrian dialect of Old English. Scots 
is associated with working-class speakers in Scotland; 
representations of it can be found in the novel Trainspotting, 
written by Irvine Welsh, and in the film of that name. Many 
Scottish speakers can speak Standard English without any trace 
of Scots, but they can also mix Scots words, such as bairn 
(child), lug (ear) and kirk (church), into their Standard English. 


13.1.3.2 SSE Vowels 

The Scottish Vowel Length Rule 
A major characteristic of SSE is that there is little evidence 
of phonemic vowel length: pairs such as /uː/ vs /ʊ/, /ɑː/ vs /æ/ 
and /ɔː/ vs /ɒ/ do not form part of the vowel phoneme system. 
But there is considerable evidence for allophonic vowel length. 
As we have seen (p. 142), some of the vowel phonemes have 
long allophones morpheme-finally or before voiced 
continuants, yielding long/short allophonic differences such as 
[lif]/[liːv] (leaf/leave) and [hʉf]/[mʉːv] (hoof/move). This 
phenomenon is known as the Scottish Vowel Length Rule 
(SVLR), a major realizational difference between SSE and RP. 
Which set of vowels undergoes the SVLR is a matter of 
debate; but /i/, /u/ and /ai/ seem to undergo it for nearly all SSE 
speakers. The two realizations of /ai/ are [ɐe], as in eye and 
wise, and [ʌi], as in right and ice. While /ai/ has these two 
allophones in exactly the same contexts as the long and short 
allophones of /i/ and /u/, the difference between the ‘short’ 
([ʌi]) and ‘long’ ([ɐe]) allophones seems to be a matter of 
vowel quality, rather than quantity. 
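
The length alternation stated by the SVLR can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the set of voiced continuants listed is a simplification chosen for the example, the function name is hypothetical, and the vowel symbols are phonemic labels rather than narrow transcriptions.

```python
# Assumed inventory of voiced continuants (voiced fricatives plus /r/).
VOICED_CONTINUANTS = {"v", "ð", "z", "ʒ", "r"}

# Vowels treated here as undergoing the length alternation. /ai/ is left
# out because its alternation is one of quality rather than length.
SVLR_VOWELS = {"i", "u"}

def svlr_allophone(vowel, following):
    """Return the long or short allophone of `vowel`.

    `following` is the next segment within the morpheme, or None when
    the vowel is morpheme-final.
    """
    if vowel not in SVLR_VOWELS:
        return vowel
    if following is None or following in VOICED_CONTINUANTS:
        return vowel + "ː"
    return vowel

print(svlr_allophone("i", "f"))  # the vowel of leaf
print(svlr_allophone("i", "v"))  # the vowel of leave
```

Passing None for a morpheme-final vowel keeps the two triggering contexts in a single condition, mirroring the way the rule is stated.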

The /ʊ/ vs /uː/ and /ʊ/ vs /ʌ/ distinctions 
One of the major systemic differences between RP and SSE 
is that SSE does not have the pool/pull (or /uː/ vs /ʊ/) type of 
distinction. Since SSE does not have phonemically long 
vowels, there is no /uː/ or /ʊ/; instead, there is a single 
phoneme: /u/. This is realized as long [ʉː] in the SVLR 
contexts, and realized as short [ʉ] elsewhere. In Scottish 
accents other than SSE, the realization of this vowel can be 
even more fronted than the high central [ʉ], sometimes 
approaching a French-type [y] sound (a high front rounded 
vowel). 

SSE, unlike many North of England accents, does have the 
put/putt (/ʊ/ vs /ʌ/ in RP) type of distinction. The words put, 
putt, pool and pull are therefore realized as [pʰʉt] (put), [pʰʌt] 
(putt) and [pʰʉl] (pull/pool), with words like poor realized as 
[pʰʉːɹ], with a long vowel, triggered by the SVLR. 

Absence of the /ɔː/ vs /ɒ/ contrast 

Another major systemic difference between RP and SSE lies 
in the fact that the /ɔː/ vs /ɒ/ contrast is missing in SSE, which 
has instead a single phoneme: /ɔ/. This means that, for many 
SSE speakers, RP minimal pairs such as cot/caught and 
not/nought are homophones. The picture is complicated by the 
fact that many SSE speakers have borrowed the /ɔː/ vs /ɒ/ 
contrast from Anglo-English and thus have /ɔː/ in words such 
as nought and caught, but /ɔ/ in words such as not and cot. 

RP has the /ɔː/ phoneme in what Wells (1982) refers to as 
the lexical sets THOUGHT, NORTH and FORCE, but the 
phoneme /ɒ/ in words of the lexical set LOT. We have seen 
that SSE speakers whose speech has not been influenced by 
RP do not have this /ɔː/ vs /ɒ/ contrast, so that caught (which 
belongs to the set THOUGHT) and cot (which belongs to the 
set LOT) are homophones. Another difference between RP 
and SSE with respect to these lexical sets is that SSE has the 
/o/ vowel in words of the set FORCE, but the /ɔ/ vowel in 
words of the set NORTH. As a result, in SSE there are pairs 
such as horse (which belongs to the set NORTH) and hoarse 
(which belongs to the set FORCE) which are minimal pairs: 
horse is pronounced [hɔɹs], while hoarse is pronounced [hoɹs]; 
in RP, these are homophones, both being pronounced [hɔːs]. 
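
The distribution of vowel phonemes over these four lexical sets can be laid out as a small lookup table. The Python sketch below is illustrative only: the dictionary and function names are hypothetical, and the SSE column describes speakers whose speech has not been influenced by RP.

```python
# Vowel phoneme assigned to each lexical set, per accent.
RP = {"THOUGHT": "ɔː", "NORTH": "ɔː", "FORCE": "ɔː", "LOT": "ɒ"}
SSE = {"THOUGHT": "ɔ", "NORTH": "ɔ", "FORCE": "o", "LOT": "ɔ"}

def same_vowel(accent, set_a, set_b):
    """True if the two lexical sets share a vowel phoneme in `accent`."""
    return accent[set_a] == accent[set_b]

# caught (THOUGHT) vs cot (LOT)
print(same_vowel(RP, "THOUGHT", "LOT"), same_vowel(SSE, "THOUGHT", "LOT"))
# horse (NORTH) vs hoarse (FORCE)
print(same_vowel(RP, "NORTH", "FORCE"), same_vowel(SSE, "NORTH", "FORCE"))
```

The two comparisons come out in opposite ways for the two accents, which is why neither system can be described as a simple subset of the other.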

Absence of the /æ/ vs /ɑː/ distinction 

Another striking systemic difference between RP and SSE 
lies in the fact that SSE does not have the /æ/ vs /ɑː/ 
distinction. Instead, it has a single ‘a’ phoneme, realized as a 
low unrounded central vowel, [ɐ], so that pairs such as ant and 
aunt are homophones in SSE: [ɐnt]. These are, of course, 
distinct in RP: ant is pronounced [ænt] in RP, while aunt is 
pronounced [ɑːnt]. Some educated SSE speakers have 
borrowed this contrast from RP, but are often variable in 
whether they produce it or not. 

The /eɪ/ and /əʊ/-type phonemes 

The SSE equivalent of the RP /eɪ/ phoneme is /e/, said by 
some to be realized as long monophthongal [eː] in the SVLR 
contexts, and as short [e] elsewhere, as in [bet] (bait) and 
[beːɹ] (bare). 

The SSE equivalent of the RP /əʊ/ phoneme is /o/, said by 
some to be realized as long monophthongal [oː] in the SVLR 
contexts, and short [o] elsewhere, as in [bot] (boat) and [boːɹ] 
(boar/bore). 

The diphthongs 

As we have seen, the SSE /ai/ diphthong undergoes the 
SVLR, being realized as [ɐe] in the vowel length contexts and 
[ʌi] elsewhere. The SSE /au/ diphthong is realized as [ʌʉ], as 
in mouth: [mʌʉθ]. 

/ə/, /ʌ/ and /e/ 

Many words which, in RP, have word-final schwa and did 
not historically end in an /r/, such as cinema, comma, America, 
are uttered with an [ʌ]-type vowel in SSE. Where RP 
word-final schwa was historically followed by an /r/, SSE retains the 
/r/ and has either a schwa or an /ɪ/, as in better and seller. 

The short [i] vowel found, in RP and many other accents, at 
the ends of words such as very, happy, lucky is usually an [e] 
in SSE. This vowel used to be [ɪ] in RP, but has undergone 
what Wells (1982) refers to as happY Tensing, in which the [ɪ] 
has been raised, or tensed, to become [i]. Many varieties of 
English exhibit happY Tensing, but not SSE. 

13.1.3.3 SSE Consonants 

Rhoticity 

SSE is rhotic; the SSE /r/ phoneme is realized in all syllabic 
positions, including the coda position, so that words like far 
and farm are always pronounced with an ‘r’. The phoneme is 
typically realized as [ɹ], sometimes as [ɾ], and very rarely as 
[r]. Some speakers exhibit an allophonic distinction between 
the [ɹ] and [ɾ] realizations, with the tap being realized in 
branching onsets, in words such as bring, trip and creep. Note 
that, when we speak of a rhotic accent, we are not referring to 
the kind of ‘r’ sound a speaker has: we are denoting an accent 
in which the ‘r’, whatever its phonetic form might be, is 
realized in coda position. 

The /ʍ/ vs /w/ distinction 

The /ʍ/ vs /w/ distinction, as in witch/which, weals/wheels 
and watt/what, as the spelling suggests, is a distinction which 
has been largely lost in RP, but is still present in SSE and some 
American accents. The /ʍ/ phoneme is realized as a voiceless 
bilabial fricative, with a secondary velar articulation. 

The /h/ vs /x/ vs /k/ distinctions 

A major systemic difference between RP and SSE lies in the 
fact that SSE has retained the phonemic distinction between /k/ 
and /x/. The /x/ phoneme is realized as a voiceless velar 
fricative ([x]) in rhymes, after low vowels and back vowels, as 
in loch (lake). It occurs in many Scottish place-names, such as 
Auchtermuchty and Lochalsh. A fronted allophone ([ç]) may 
occur after high front vowels, but this tends to be restricted to 
certain words from Scots which have been incorporated into 
the speech of SSE speakers; an example is dreich, pronounced 
[dɹiç] or [dɾiç], a word used to refer to cold, grey and wet 
weather. As with /ʍ/, the /x/ phoneme has been lost in RP; RP 
speakers often utter [k] instead in words like loch. Unlike 
speakers of some accents in England, SSE speakers do not 
elide the /h/ phoneme before stressed vowels. 

‘Dark l’ 

The SSE /l/ phoneme is realized as a ‘dark l’, i.e. [ɫ], in all 
contexts. Recall that ‘dark l’ is an informal term for an alveolar 
lateral approximant with a secondary articulation of 
velarization. Recall too that the ‘dark l’ realization only occurs 
in the rhyme of the syllable in RP, so that the realizations of 
the two /l/ phonemes in a word such as lull are distinct in RP: 
[lʌɫ]. In SSE, this word is pronounced [ɫʌɫ]. 

13.2 Two American Accents 
13.2.1 New York City English 


13.2.1.1 Defining the New York City English Accent 

The New York City English accent is fairly sharply defined in 
geographical terms, being largely confined to the boroughs of 
New York City (henceforth: New York). There is, however, 
considerable socially determined variation within New York, 
and this variation has been the subject of a good deal of 
sociolinguistic study. The New York accent is widely 
recognized in the United States and, like many urban accents, 
evokes mainly negative reactions. One of the questions 
addressed in the sociolinguistic studies conducted in New York 
is that of rhoticity. It seems clear that the accent has made the 
historical transition from rhotic to non-rhotic, since it has (most 
of) the set of centring diphthongs ending in schwa which are 
characteristic of non-rhotic accents. However, there is 
considerable sociolinguistic variation with respect to rhoticity, 
and it appears that [ɹ] in coda position is staging a comeback. 
Recall that the standard accent in the United States, General 
American, is rhotic, and that rhoticity is regarded as more 
prestigious than non-rhoticity. 


13.2.1.2 New York City Vowels 

The [ɜɪ] vowel 

The [ɜɪ] realization of the /ɜ/ phoneme, as in [hɜɪd] (herd), 
is widely regarded as characteristic of New York speech, and is 
often said to characterize ‘the Brooklyn accent’ (although, as 
we noted earlier, it is by no means restricted to Brooklyn). 
However, it is highly stigmatized and is probably dying out. 
This realization occurs before a coda consonant, so that it does 
not occur in non-rhotic pronunciations of purr. Some speakers 
also have an [ɜɪ] realization of the /ɔɪ/ phoneme, also before a 
coda consonant, so that some minimal pairs have become 
homophones, as in voice and verse. 

Allophones of the /æ/ phoneme 

There are variable [ɛːə] and [æː] realizations of /æ/ in certain 
environments, namely before a voiceless fricative, voiced stop 
or nasal when they occur in a word-final coda (although /p/ 
behaves variably), as in hash, past, bad, stabs, man and damp, 
but not in pal, with a final /l/, or hat, with a final voiceless 
stop. These diphthongal or long realizations are sometimes 
referred to as ‘tense’ realizations of the phoneme. It is not clear 
why these specific consonants in that position should trigger 
this tensing process. There is considerable sociolinguistic 
variation in the exact phonetic form of the allophones, and, for 
some speakers, the [ɛːə] realization merges with realizations of 
the /ɛə/ phoneme, collapsing minimal pairs such as bad/bared. 
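
The triggering environment for the ‘tense’ realizations can be restated as a simple membership test. The Python sketch below is an illustration under stated assumptions: the consonant classes are small sets chosen to cover the example words, the function name is hypothetical, and the variable behaviour mentioned above is not modelled.

```python
# Assumed consonant classes, limited to what the example words need.
VOICELESS_FRICATIVES = {"f", "θ", "s", "ʃ"}
VOICED_STOPS = {"b", "d", "g"}
NASALS = {"m", "n"}

def triggers_tensing(coda_consonant):
    """True if the first consonant of a word-final coda is one of the
    classes said to trigger a tense realization of the TRAP vowel."""
    return coda_consonant in VOICELESS_FRICATIVES | VOICED_STOPS | NASALS

# First coda consonant of each example word.
examples = {"hash": "ʃ", "past": "s", "bad": "d", "stabs": "b",
            "man": "n", "damp": "m", "pal": "l", "hat": "t"}
for word, consonant in examples.items():
    print(word, triggers_tensing(consonant))
```

Listing the classes separately makes it plain that they do not obviously form a natural class, which is the puzzle noted in the text.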

Realizations of /ɔ/ 

New York speakers often have [ɔə] and [oə] realizations of 
the /ɔ/ phoneme, as in [pʰɔə] (paw/pour/pore), and, with an 
even higher starting point, [ʊə], thus creating the possibility of 
partial merger with the /ʊə/ phoneme. 


13.2.1.3 New York City Consonants 

Rhoticity and non-rhoticity 
The discussion above shows that New York speech has 
undergone the transition from rhoticity to non-rhoticity, and is 
reverting, for many speakers, to rhoticity. This phenomenon 
may well be resulting in greater occurrence of intrusive [ɹ], 
since a speaker who has been non-rhotic but is attempting to 
be rhotic may well insert intrusive [ɹ]s as part of a general 
strategy to utter [ɹ] where it may otherwise be absent. 

Realizations of /θ/ and /ð/ 

The phonemes /θ/ and /ð/ are often realized as either 
affricates ([tθ] and [dð]) or dental stops: [t̪] and [d̪]; the 
variation is sociolinguistically determined. Note that, for 
speakers who have dental stops, the phonemic distinction 
between alveolar and dental stops is maintained, as in [t̪ʰɪn] 
(thin) vs [tʰɪn] (tin). To many speakers of other varieties of 
English, the distinction may well be difficult to notice. 

Realizations of /t/ 

New York speech, like General American, has Flapping of /t/ 
in intervocalic environments (where the first vowel is stressed), 
but it also has glottal stop realizations of /t/ in coda position on 
a greater scale than in GA. The /t/ phoneme is often heavily 
aspirated, to the point, at times, of being affricated, in 
syllable-initial position, as in [tˢɪn] (tin). 

13.2.2 Texan English (Track 13.7 and exercise 4) 

13.2.2.1 Defining the Texan Accent 
Texas is a vast state, said to be larger than France. It is 
therefore unsurprising that there is much variability within 
Texan English. Many educated Texans speak both a Texan 
variety of English and General American, depending on the 
context of utterance. Texan English is rhotic in many parts of 
the state, but, as one approaches the eastern border with 
Louisiana, variable non-rhoticity can be found. Whether 
varieties of Texan English can be considered Southern varieties 
of US English depends on what one considers to constitute the 
linguistic American South: some dialect maps exclude Texas 
from the South, while others include eastern parts of the state. 
Many of the pronunciation features one encounters in Texas 
can be found in neighbouring states, but there are some 
pronunciation features that are said to be specifically Texan. 

13.2.2.2 Texan Vowels 

The /aɪ/ phoneme of the lexical set PRICE is often 
monophthongized in Texan speech. The phenomenon, referred 
to informally by Wells (1982) as PRICE smoothing, results in a 
long monophthong, as in price pronounced [pʰɹaːs]. PRICE 
smoothing can be heard in neighbouring states. 

Monophthongization can also be heard in the Texan 
realization of the /ɔɪ/ phoneme in words of the lexical set 
CHOICE, so that words such as oil are pronounced [ɔːl], 
though the phoneme is sometimes pronounced with a schwa 
off-glide, as in [ɔəl]. 

In addition to monophthongization, in Texas and in many 
Southern states, there is diphthongization of the phonemes /ɪ/, 
/ɛ/ and /æ/, in words of the lexical sets KIT, DRESS and 
TRAP. The diphthongs in question have a schwa off-glide, so 
that kit is pronounced [kʰɪət], dress is pronounced [dɹɛəs] and 
trap is pronounced [tɹæəp]. One can, at times, hear 
pronunciations of words of the TRAP set with what is either a 
triphthong or a bisyllabic pronunciation, as in [bæjənd] for 
band. If such pronunciations have two syllables, then we might 
argue that they contain an [æj] diphthong followed by a schwa 
in the second syllable. 

The /ɛ/ phoneme in words of the lexical set DRESS also 
undergoes diphthongization to [eɪ] before the phonemes /ʃ/, /ʒ/ 
and /p/, so that the word special, pronounced [speɪʃəl], sounds 
just like the word spatial. 

The phonemic distinction between the /ɛ/ of the lexical set 
DRESS and the /ɪ/ of the lexical set KIT is neutralized before 
coda /n/ in the speech of many Texans, so that pairs such as 
Ken and kin are both pronounced as [kʰɪn]. Neutralization is, 
as we have seen, defined as the suspension of a phonemic 
contrast in specific phonological environments (here, before 
coda /n/): it is not the case that speakers who exhibit this 
neutralization have lost the /ɛ/ vs /ɪ/ contrast altogether. 
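
The difference between neutralization and outright loss of a contrast can be made concrete with a small Python sketch. It is illustrative only: the function name and the boolean flag are hypothetical, and the transcriptions are broad.

```python
def surface_vowel(phoneme, before_coda_n):
    """Realize the DRESS and KIT vowels; the two are kept apart
    everywhere except before coda /n/, where both surface as [ɪ]."""
    if before_coda_n and phoneme in {"ɛ", "ɪ"}:
        return "ɪ"
    return phoneme

# Before coda /n/ the contrast is suspended: Ken and kin coincide.
ken = "kʰ" + surface_vowel("ɛ", before_coda_n=True) + "n"
kin = "kʰ" + surface_vowel("ɪ", before_coda_n=True) + "n"
print(ken == kin)

# Elsewhere it is maintained: the vowels of bet and bit stay distinct.
print(surface_vowel("ɛ", False) == surface_vowel("ɪ", False))
```

The underlying phonemes remain distinct inputs to the function; only the output in one environment is shared, which is what the term neutralization is meant to capture.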

So, the /ɛ/ phoneme has a wide variety of realizations among 
speakers of Texan English, from [ɛə], to [eɪ], to [ɪ]. 

The /aʊ/ diphthong in words of the set MOUTH often has a 
higher starting point, in Texas and in neighbouring states, so 
that the word mouth is pronounced [mæʊθ]. 

The phoneme in words of the lexical set LOT often has a 
diphthongal realization in Texan English, so that the word dog 
is pronounced [dɔʊg], often written informally as dawg. Given 
this and the MOUTH vowel, a compound such as cow dogs 
(dogs for herding cows) can be pronounced as [kʰæʊdɔʊgz]. 

The /uː/ vs /ʊ/ distinction is often neutralized before coda /l/ 
in Texan English, so that pull and pool are both pronounced 
[pʰʊl]. 

13.2.2.3 Texan Consonants 

The consonants of Texan English are broadly parallel to those 
of General American, featuring ‘dark l’ in all contexts, Flapping 
and rhoticity. However, as one approaches the border with 
Louisiana, variable non-rhoticity can be observed, since the 
Southern accents to the east of Texas are largely non-rhotic. 

We saw that SSE speakers have retained the /ʍ/ vs /w/ 
distinction in pairs such as witch/which and Wales/whales. This 
can also be attested among some speakers of Texan English. 


13.3 Two Southern Hemisphere Accents 

13.3.1 Australian English (Track 13.8 and exercise 5) 

13.3.1.1 Defining the Australian English Accent 

Descriptions of Australian English often distinguish between 
three socially defined varieties: Cultivated, General and Broad 
Australian. We do not examine the differences between these, 
which mostly concern vowel articulations. We will, however, 
give a brief overview of General Australian, which is spoken 
throughout Australia. General Australian English pronunciation 
has its origins in the speech of early nineteenth-century 
working-class speakers from the South East of England, and is 
therefore similar to present-day London English in some 
respects. 

13.3.1.2 General Australian Vowels 

The main characterizing properties of General Australian are to 
be found in vowel articulations. The vowel system of General 
Australian is parallel to that of RP, but with many major 
realizational differences, mostly of a ‘vowel shift’ nature, as we 
noted earlier in chapter 12. 

The /iː/, /eɪ/, /aɪ/, /ɔɪ/ vowel shift 

We described this vowel shift in chapter 12. It can be 
depicted as follows: 

/iː/    /eɪ/    /aɪ/    /ɔɪ/ 

("I M to.] [»] 

The /uː/, /əʊ/, /aʊ/ vowel shift 

Like the high front unrounded phoneme /iː/, the high back 
rounded vowel phoneme /uː/ has diphthongized, often resulting 
in a diphthong with a high front unrounded starting point and a 
high back unrounded finishing point; we will transcribe this as 
[nu]. This realization potentially encroaches on the space of 
the [əʊ]-type realizations of /əʊ/, which have shifted to [ɐʉ], 
with a low central unrounded starting point and a high central 
rounded end point, thus entering the space of /aʊ/, whose 
realization has shifted to [æɔ], in which the starting point is 
more front than that of [ɐʉ], and the end point lower and 
further back. This set of shifts can be depicted as follows: 

/uː/    /əʊ/    /aʊ/ 

[«u] [e»] [a?o] 

The /ʌ/, /æ/, /ɛ/, /ɪ/ vowel shift 

A vowel shift has also affected the short vowels /ʌ/, /æ/, /ɛ/ 
and /ɪ/, with /ʌ/ being realized as a low front articulation in the 
[a] area, close to the space of the /æ/ phoneme, which is 
realized as [ɛ]-like. In turn, /ɛ/ is realized as [e]-like, and thus 
close to the space of /ɪ/, which is rather [i]-like. This 
articulation is, of course, distinct from that of /iː/, which, as we 
have seen, is diphthongal. We may depict this set of vowel 
shifts as follows: 

/ʌ/    /æ/    /ɛ/    /ɪ/ 

[a]    [ɛ]    [e]    [i] 

The /ɑː/ vs /ʌ/ distinction 

The General Australian realization of /ɑː/, like that of /ʌ/, is 
also fronted, but the distinction between the two is not merged, 
since there is a length difference between them, as in [pʰat] 
(putt) vs [pʰaːt] (part). 

13.3.1.3 General Australian Consonants 

We have already noted that General Australian, like SSE, has 
‘dark l’ in all positions; the precise nature of the ‘darkness’ 
may entail, in both cases, retraction and lowering of the tongue 
body, rather than velarization as such. 

General Australian is non-rhotic. It also has a process rather 
similar to that of Flapping in North American English: a /t/ will 
often be realized as a voiced articulation between vowels. 

13.3.2 Indian English (Track 13.9 and exercise 6) 

13.3.2.1 Defining Indian English 

There are many languages spoken on the Indian subcontinent. 
They are divided into two language families. The languages of 
the North belong to the Indo-European language family. This is 
a vast family of historically related languages. In the nineteenth 
century, linguists were able to show that Sanskrit, an ancient 
language of the North of India, was ultimately related to 
Ancient Greek and to Latin. From this discovery, it was 
possible to show that the languages of the North of India were 
related to many of the languages spoken in Europe, such as the 
Romance languages (including French, Spanish, Portuguese 
and Italian) and the Germanic languages (including English, 
German, Dutch and Swedish). The Indo-European languages 
of the North of India include Hindi, Marathi, Gujerati and 
Punjabi. It is hard to believe, when listening to those languages, 
that they are historically related to English, but they are; it 
has to be borne in mind that the time-span over which these 
languages have evolved is vast. The languages spoken in the 
South of India belong to an entirely distinct language family, 
known as the Dravidian language family. These include Tamil, 
Malayalam, Kannada, Telugu and Toda. The languages of 
India, whether Indo-European or Dravidian, are not mutually 
comprehensible. English has therefore taken on the status of a 
lingua franca in India: a language which can be used as a form 
of communication between people whose native languages are 
not mutually comprehensible. Many educated Indians have 
English as a second language, and thus use English as a lingua 
franca. 


13.3.2.2 Indian English Vowels 

The vowels of the sets FACE and GOAT, which are 
diphthongs in RP and GA, are typically realized as the 
monophthongs [e] and [o] in Indian English, so that FACE is 
[fes] and GOAT is [got]. 

Many, but not all, speakers of Indian English are rhotic, and 
this has consequences for the vowel system: as we have seen, 
there are no centring diphthongs of the sort /ɪə/, /ɛə/, /ʊə/ in 
rhotic accents of English. The /ɜː/ vowel in words of the lexical 
set NURSE is often not present in either rhotic or non-rhotic 
varieties of Indian English. Even among non-rhotic speakers of 
Indian English, a vowel of the sort [ʌ], of the lexical set 
STRUT, can be found in words of the NURSE set, so that 
a word such as merlot (an English word of French origin, 
denoting a grape used to produce a red wine of that name), 
pronounced [ˈmɜːləʊ] in RP, is pronounced [ˈmʌlo]. 

The RP distinction between the long vowel /ɔː/ (of the lexical 
sets THOUGHT, NORTH and FORCE) and the short vowel 
/ɒ/ (of the lexical set LOT) is often absent in varieties of Indian 
English. RP minimal pairs such as awful ([ˈɔːfəl]) and offal 
([ˈɒfəl]) are therefore homophones in the speech of many 
speakers of Indian English, both being pronounced [ˈɔfəl], with 
a short [ɔ], as in the word-play utterance ‘There’s an awful lot 
of offal being thrown away in British kitchens.’ 

Indian English has happY Tensing, so that the word happy is 
pronounced [hæpi], as in RP. 


13.3.2.3 Indian English Consonants 

Some of the Indo-European languages of the North of India 
have a four-way phonemic contrast between voiceless 
unaspirated stops, voiceless aspirated stops, voiced stops and 
breathy voiced stops. We have already encountered voiceless 
unaspirated stops in English words such as spit, stick and skin. 
We have also encountered aspirated voiceless stops in English 
words such as the [pʰ] in pit; the initial stops in the words tip 
and king are also aspirated. Voiced stops occur in words such 
as labour, ladder and wriggle. We have not yet encountered 
breathy voiced stops. In voiced sounds, the vocal cords are 
vibrating close together. In breathy voiced stops, the vocal 
cords are further apart, but still vibrate because of increased 
airflow through the glottis. The bilabial, alveolar and velar 
breathy voiced stops are transcribed as [bʱ], [dʱ] and [gʱ]. They 
occur in Hindi words which feature in Indian cookery, such as 
bhindi (okra), dhania (coriander seeds) and ghee (clarified 
butter). Examples of minimal pairs featuring this four-way 
contrast are [bal] (‘hair’) vs [pal] (‘take care of’) vs [pʰal] 
(‘knife blade’) vs [bʱal] (‘forehead’). You can hear these, and 
minimal pairs at other places of articulation, on Track 13.1. 
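
The four-way contrast can be pictured as the cross-classification of two binary properties. The Python sketch below is one possible way of laying this out, not an analysis proposed in the text: the feature names are hypothetical, and treating breathy voice as the voiced counterpart of aspiration is an assumption made for the illustration.

```python
from typing import NamedTuple

class Stop(NamedTuple):
    voiced: bool
    aspirated: bool  # assumption: for voiced stops, read as breathy voice

# The four bilabial stops, keyed by a broad transcription.
BILABIALS = {
    "p": Stop(voiced=False, aspirated=False),
    "pʰ": Stop(voiced=False, aspirated=True),
    "b": Stop(voiced=True, aspirated=False),
    "bʱ": Stop(voiced=True, aspirated=True),
}

def differing_features(a, b):
    """List the feature names on which two stops differ."""
    fa, fb = BILABIALS[a], BILABIALS[b]
    return [name for name in Stop._fields
            if getattr(fa, name) != getattr(fb, name)]

print(differing_features("p", "pʰ"))
print(differing_features("p", "bʱ"))
```

Two binary features yield exactly four combinations, one for each member of the series.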
It is common for speakers of Indian English to produce 
words such as pit, tip and king with unaspirated voiceless stops 
in initial position. On the face of it, this is puzzling, since many 
speakers of Indian English have a series of aspirated voiceless 
stops in their native Indian language: why not just use the 
aspirated voiceless stops of one’s native language when uttering 
words like pit, tip and king? Some insight into this 
phenomenon may be gained if we consider that many of the 
native Indian languages lack dental fricatives; speakers of 
Indian English therefore engage in ‘TH-stopping’, in which 
they produce aspirated voiceless dental [t̪ʰ] for /θ/ and the 
breathy voiced dental stop [d̪ʱ] for /ð/. Perhaps it is because a 
member of the aspirated voiceless stop series ([t̪ʰ]) is being 
used for /θ/ that the unaspirated series is used in words which 
would otherwise have an aspirated voiceless stop in RP. 

Many of the native languages of India have a series of 
retroflex stops. Retroflex sounds are produced with the tip and 
blade of the tongue curled backwards, so that the underside of 
the tongue forms an articulation with the alveolar ridge. Stops 
formed this way include the voiceless retroflex stop [ʈ] and the 
voiced retroflex stop [ɖ]. Speakers of Indian English often 
realize /t/ and /d/ in this retroflex manner; this can often be 
heard in the pronunciation of waiters in Indian restaurants, who 
will often produce words such as chapati (an Indian flatbread) 
with a retroflex [ʈ], and poppadum (a thin, crispy Indian bread) 
with a retroflex [ɖ]. You can hear this on Track 13.2. 

The IV phoneme is not realized as a ‘dark’ (velarized) lateral 
in most Indian English. The /r/ phoneme is often realized as an 
alveolar tap, so that the word rap is pronounced as [rap]. The 
phonemes /v/ and /w/ are not always consistently distinguished 
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by speakers of Indian English. Many of the native languages of 
India have an approximant which is intermediate between the 
approximant [w] and the fricative [v]. The sound they have is 
a labio-dental approximant, in which the lower lip forms a 
stricture of open approximation with the upper teeth. This 
sound is transcribed as [ʋ]. It is like a [w] in that it is an 
approximant, and it is like a [v] in that it is labio-dental. You 
can hear the contrast between [v], [ʋ] and [w] (illustrated from 
the African language Isoko) on Track 13.3. Speakers of Indian 
English, because they lack a /v/ vs /w/ distinction, will often 
produce [w] for [v], or a [ʋ] for [v]. 

Many of the native languages of India are not stress-timed, 
so Indian English is often spoken without the stress timing that 
is typical of so many varieties of English. Linked to this is the 
fact that the word stress often falls on the wrong syllable in 
Indian English. 


13.4 An Overview of Some Common Phenomena Found in Accent Variation 

It is clear from our brief description of the above varieties of 
spoken English that the speech of any speech community, and 
indeed of a given speaker within a community, is typically 
variable. It is such variation that can eventually lead to 
divergence between the speech of different communities over 
time. Several different factors are involved in such variation. A 
host of sociolinguistic factors is relevant. These may include 
the sex, age or social class of the speaker, the speaker’s social 
aspirations, the structure of the society in which the speaker 
lives, and complex aspects of the social networks in which the 
speaker lives, involving such things as solidarity, group identity 
and individual identity. We have not examined these factors 
here, but we note that an understanding of them is vital if we 
are to gain a comprehensive understanding of variation in 
pronunciation in a given speech community. 

Variation in pronunciation is also constrained by factors other 
than societal ones, such as the nature of the vocal tract, the 
relationship between the phonemes within a phoneme system, 
and factors to do with the relative perceptual salience of 
sounds, depending on what sounds they are preceded or 
followed by, where they occur within syllable structure, and 
whether they occur in stressed or unstressed syllables. The 
following is a summing up of the sorts of phenomena we have 
discussed in describing accent variation. 

13.4.1 Vowel Phenomena 

13.4.1.1 Diphthongization 

Diphthongal realizations of vowel phonemes may be triggered 
by an adjacent consonant, as in the [i:ə] realization of /i:/ 
before ‘dark l’, or may occur ‘spontaneously’; we have seen 
many examples of the latter in General American, New York 
City English, General Australian, London English, RP and 
Tyneside English. Similarly, RP used to have monophthongal 
realizations of the mid vowels /e/ and /o/, but these have 
undergone diphthongization in the history of RP. 
Diphthongization of these mid vowels has also taken place in 
the history of GA, but not in SSE. 


13.4.1.2 Monophthongization 

We have seen many examples of monophthongal realizations of 
diphthongs, as in [ɛ:] instead of [ɛə] and [ɔ:] instead of [ʊə] in 
RP. 

13.4.1.3 Splits 

As we have seen, phonemes may have allophones, but those 
allophones may gain phonemic status if a pattern of 
complementary distribution is disrupted, resulting in a newly 
emergent pattern of parallel distribution. Examples include the 
emergence of the /ə/-ending centring diphthong phonemes (/ɪə/, 
/ɛə/, /ʊə/), and the /ʊ/ vs /ʌ/ distinction (the FOOT/STRUT 
split) in RP and other accents of English. 
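
The difference between complementary and parallel distribution can be made concrete with a small Python sketch. This is only a toy illustration under stated assumptions: the function names and the token lists are hypothetical, invented for the example, and a 'context' is reduced to the segment that follows the vowel. Two sounds count as being in parallel distribution as soon as they share at least one context.

```python
def environments(tokens, sound):
    """Collect every context in which `sound` is observed."""
    return {context for observed, context in tokens if observed == sound}


def distribution(tokens, a, b):
    """Return 'parallel' if the two sounds share a context, else 'complementary'."""
    shared = environments(tokens, a) & environments(tokens, b)
    return "parallel" if shared else "complementary"


# Each token pairs a vowel with the segment that follows it ("#" = end of word).
# Hypothetical, simplified data: before the loss of coda [ɹ], [ɪə] occurs only before [ɹ].
before_r_loss = [("i:", "d"), ("i:", "#"), ("ɪə", "ɹ")]
# After the loss of coda [ɹ], [ɪə] turns up in the same contexts as [i:] (bead vs beard).
after_r_loss = [("i:", "d"), ("i:", "#"), ("ɪə", "d"), ("ɪə", "#")]

print(distribution(before_r_loss, "i:", "ɪə"))  # complementary
print(distribution(after_r_loss, "i:", "ɪə"))   # parallel
```

Real data would of course need richer contexts (preceding segment, syllable position, stress), but the logic of the test stays the same.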

13.4.1.4 Mergers 

Since articulations may shift, it is possible for the realizations 
of one phoneme to merge with those of another, resulting in 
the loss of a phonemic distinction. Examples are the mergers of 
the /k/ vs /x/ and /ʍ/ vs /w/ contrasts in many accents of 
English, resulting in words which were previously minimal 
pairs becoming homophones, as in lock/loch and which/witch. 
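
The effect of a merger on the lexicon can be sketched in a few lines of Python. This is a minimal illustration, not an analysis of any recording: the four-word lexicon, its broad transcriptions and the helper names are assumptions made for the example. Each word is stored as a tuple of phoneme symbols, a merger is a mapping from the lost phoneme to the one it falls together with, and homophones are simply words that end up with identical tuples.

```python
# Toy lexicon: broad phonemic transcriptions as tuples of phoneme symbols.
# These transcriptions are illustrative assumptions, not data from the recordings.
LEXICON = {
    "lock": ("l", "ɒ", "k"),
    "loch": ("l", "ɒ", "x"),
    "which": ("ʍ", "ɪ", "tʃ"),
    "witch": ("w", "ɪ", "tʃ"),
}


def apply_merger(lexicon, merger):
    """Replace each merged phoneme by the phoneme it falls together with."""
    return {
        word: tuple(merger.get(segment, segment) for segment in segments)
        for word, segments in lexicon.items()
    }


def homophone_sets(lexicon):
    """Group words that share a transcription; keep groups of two or more."""
    groups = {}
    for word, segments in lexicon.items():
        groups.setdefault(segments, []).append(word)
    return sorted(sorted(words) for words in groups.values() if len(words) > 1)


before = homophone_sets(LEXICON)
after = homophone_sets(apply_merger(LEXICON, {"x": "k", "ʍ": "w"}))

print(before)  # []
print(after)   # [['loch', 'lock'], ['which', 'witch']]
```

Before the merger each pair is a minimal pair and no homophones are found; after it, both pairs collapse.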


13.4.1.5 Vowel Shifts 

When the realization of one phoneme encroaches on the 
realization of another vowel phoneme, ‘evasive action’ may be 
taken, so that the phonemic contrast is sustained and merger is 
avoided. We have seen examples of this in both London 
English and Australian English. 

13.4.1.6 Vowel Reduction 

Vowels frequently reduce to either a weak form (e.g. /i:/ → [i] 
and /u:/ → [u]), or to schwa, in unstressed syllables which, as 
we have seen, are perceptually less salient. This phenomenon 
occurs in almost every variety of English. 

13.4.2 Consonantal Phenomena 

13.4.2.1 Weakening of Consonants 

A common phenomenon is intervocalic weakening, in which a 
consonant articulation becomes more vowel-like, in the sense 
of becoming voiced, or undergoing a diminution in degree or 
duration of stricture. Flapping in North American English, 
voicing of /t/ in Australian English and ‘T-to-R’ in Tyneside 
English are all examples of this phenomenon. 

In coda position, consonants often undergo weakening in the 
form of reduction in degree of stricture, sometimes leading to 
complete elision. Examples are the erosion and eventual loss of 
coda [ɹ] in non-rhotic accents, and the reduction of voiceless 
stops to glottal stops, which occurs to some extent in all of the 
accents we have considered. Vocalization of coda /l/, which 
occurs in London English, is another such process. 


13.4.2.2 Affrication of Stops 

This phenomenon, in which /p/, /t/ and /k/ may be realized as 
[pɸ]/[pf], [tθ]/[ts] and [kx] respectively, appears to be 
connected with strong aspiration. It has been attested in New 
York City, Liverpool and London. 

The principal point to be borne in mind about such processes 
is that they are rarely limited to specific accents of English. Nor 
are they limited to English. Because they arise from the nature 
of the human vocal tract, human perceptual capacities and the 
structure of human language phonologies, they are widely 
attested across the world’s languages. 


Notes 

1 There is often quite marked nasalization of the vowel in 
cases like these, where a nasal follows an open vowel. 

2 Some Tyneside speakers do in fact have a contrast, 
between [u] and an unrounded, sometimes centralized, 
version of [u], which we might transcribe as [i]. 

3 It has often been noted that auxiliary verbs which end with 
a tensing coda consonant, such as can and had, nevertheless 
do not undergo tensing. 


Exercises 

References for the exercises are given at the end of the 
chapter. 

1 London English 

Listen to Track 13.4. This is a sound file from the IViE 
corpus (Intonational Variation in English: see Grabe, Nolan 
and Farrar 1998 for details and further references), which 
contains recordings of UK speakers from London, 
Cambridge, Bradford, Leeds, Liverpool, Newcastle, 
Cardiff, Northern Ireland (Belfast) and the Republic of 
Ireland (Dublin). You can access the corpus at the 
following website: www.phon.ox.ac.uk/files/apps/IViE. 
Alternatively, enter ‘IViE’ into a search engine. The speaker 
is reading the first two paragraphs of the following 
passage: 

Once upon a time, there was a girl called Cinderella, but 
everyone called her Cinders. Cinders lived with her 
mother and two stepsisters called Lilly and Rosa. Lilly 
and Rosa were very unfriendly and they were lazy girls. 
They spent all their time buying new clothes and going 
to parties. Poor Cinders had to wear all their old hand- 
me-downs! And she had to do the cleaning! 

One day, a royal messenger came to announce a ball. 
The ball would be held at the Royal Palace, in honour 
of the queen’s only son, Prince William. Lilly and Rosa 
thought this was divine. Prince William was gorgeous, 
and he was looking for a bride! They dreamed of 
wedding bells! 

When the evening of the ball arrived, Cinders had to 
help her sisters get ready. They were in a bad mood. 
They’d wanted to buy some new gowns, but their 
mother said that they had enough gowns. So they 
started shouting at Cinders. ‘Find my jewels!’ yelled 
one. ‘Find my hat!’ howled the other. They wanted 
hairbrushes, hairpins and hairspray. 

(a) Is the speaker consistently non-rhotic? 

(b) In words which would have word-final schwa in 
RP (such as Cinderella and Cinders), what vowels do 
you find with this speaker? 

(c) Is there any evidence of ‘TH-fronting’ (the 
articulation of the dental fricative phonemes /θ/ and /ð/ 
as labio-dental fricatives [f] and [v])? 

(d) This speaker exhibits /l/ vocalization. We have said 
that this occurs in the rhyme of the syllable in London 
English, but not in the onset. Is this true for this 
speaker? 

(e) In what contexts does this speaker produce glottal 
stop realizations of the /t/ phoneme? 

2 Tyneside English (Newcastle) 



Listen to Track 13.5. This is also an IViE file. The 
speaker is reading all three paragraphs of the Cinderella 
passage in exercise 1. 

(a) North of England accents are said to lack the 
FOOT/STRUT split (i.e. the phonemic distinction 
between /ʊ/ and /ʌ/). Is this the case for this speaker? 
Provide examples of relevant words on the sound 
track. 

(b) In words written with <-er(s)> at the end, such as 
Cinders and mother, what vowel does this speaker 
utter in that final unstressed syllable? Is this what you 
would expect from a Geordie speaker? 

(c) Like many speakers of Tyneside English, the 
speaker has two realizations of the /aɪ/ diphthong. Can 
you transcribe them and give examples of words on the 
recording which contain them? 

(d) In words such as clothes, does the speaker have a 
monophthongal or a diphthongal realization of the 
vowel? 

(e) Geordie speakers are said to have ‘clear’ 
(palatalized) [lʲ] in all syllabic contexts. Is this true for 
this speaker? 

(f) Cite examples of glottaling in the speech of this 
speaker. The speaker realizes /d/ as a glottal stop in 
one word: can you identify which word this is? 

(g) We said that the centring diphthongs /ɪə/ and /ʊə/ 
are often realized as /ɪɐ/ and /ʊɐ/ in Tyneside English. 
Can you hear any examples of this on the recording? 

3 Standard Scottish English (Glasgow) 



Listen to Track 13.6. This is a recording from the PAC 
project (La Phonologie de l’anglais contemporain; see 
Carr, Durand and Pukli 2004 for details). The PAC 
website is at: http://www.projet-pac.net/ 

The speaker is reading the following PAC word list: 

1. pit 

2. pet 

3. pat 

4. pot 

5. put 

6. putt 

7. sea 

8. say 

9. sigh 

10. sue 

11. stir 

12. steer 

13. stairs 

14. err 

15. far 

16. war 

17. more 

18. purr 

19. moor 

20. feel 

21. fill 

22. fell 

23. fall 

24. full 

25. fool 

26. fail 

27. foal 

28. file 

29. foul 

30. foil 

31. furl 

32. bird 

33. bard 

34. beard 

35. bared 

36. board 

37. barred 

38. bored 

39. bode 

40. bowed 

41. bead 

42. bid 

43. bed 

44. bad 

45. bard 

46. pant 

47. plant 

48. master 

49. afterwards 

50. ants 

51. aunts 

52. dance 

53. farther 

54. father 

55. row 

56. rose 

57. rows 

58. pore 

59. poor 

60. pour 

61. paw 

62. paws 

63. pause 

64. pose 

65. wait 

66. weight 

67. side 

68. sighed 

69. agreed 

70. greed 

71. brood 

72. brewed 

73. fir 

74. fair 

75. fur 

76. four 

77. fore 

78. for 

79. nose 

80. knows 

81. cot 

82. caught 

83. meat 

84. meet 

85. mate 

86. naught 

87. knot 

88. doll 

89. dole 

90. fierce 

91. bird 

92. scarce 

93. pert 

94. start 

95. horse 

96. hoarse 

97. word 

98. gourd 

99. short 

100. sport 

101. next 

102. vexed 

103. leopard 

104. shepherd 

105. here 

106. there 

107. weary 

108. spirit 

109. marry 

110. Mary 

111. merry 

112. sorry 

113. story 

114. hurry 

115. jury 

116. bury 

117. berry 

118. heaven 

119. leaven 

120. earth 

121. berth 

122. cook 

123. soot 

124. look 

125. room 

126. pearl 

127. peril 

(a) SSE is said to be rhotic, but there are speakers who 
are variably non-rhotic. Is this speaker consistently 
rhotic? 

(b) SSE is said not to have a contrast between long /u:/ 
and short /ʊ/, of the sort found in RP. Instead, SSE is 
said to have a single /u/ phoneme, realized as the high 
central rounded vowel [ʉ]. Is this true for the speaker 
on the recording? 

(c) Some SSE speakers have only a two-way contrast 
in words such as bird, curd and heard, with bird and 
curd having /ʌ/, heard having /ɛ/. Others have a three- 
way contrast, with /ɪ/ in bird, /ʌ/ in curd and /ɛ/ in 
heard. Does this speaker have a two-way contrast in 
these ‘pre-r’ positions, or a three-way contrast? 

(d) SSE is said to possess a contrast between short /o/, 
as in coat, and short /ɔ/, as in cot. It is said not to have 
a contrast, of the sort found in RP, between words of 
the lexical set LOT (which have a short /ɒ/ in RP) and 
words of the sets THOUGHT and NORTH (with a 
long /ɔ:/ in RP). Does this speaker exhibit any length 
difference between pairs such as cot/caught and 
knot/naught? 

(e) The speaker is typically SSE in that she has two 
realizations of the /ai/ diphthong: [ae] and [ʌi]. Can you 
identify words containing these? 

(f) SSE speakers are said to have monophthongal /e/ 
and /o/, distinct from the RP diphthongs /eɪ/ and /əʊ/. 
Is this true for this speaker? 

(g) SSE speakers are said to have ‘dark l’ ([ɫ]) in all 
syllabic positions. Is this true for this speaker? 

(h) The speaker has both the approximant [ɹ] and the 
tap [ɾ]. Identify words containing these. 

4 West Texan English (Lubbock) 



Listen to Track 13.7. This too is a PAC recording; the 
speaker is reading out the following written passage: 

I saw Hoyt on the news the other day. I can’t be sure 
whether I got the facts right. But I think he wants 
Lubbock farmers to plant soy, rye, and maybe rice for 
the new crop. I tuned in to the farm report last night. 
They say cotton prices will continue to rise. It sure is 
good that rot didn’t set in from all the rain. I tell you, 
the morning dew wasn’t much help either. I still think 
we will get back all our investment this year. Getting in 
the cotton crop this year was a challenge. I had my 
oldest boy at work with us since he isn’t in school 
anymore. He likes to work on the farm. But I feel he 
should work in a business like oil, cattle ranching, or 
maybe nothing to do with agriculture. He just turned 
nineteen. You know, he doesn’t play with toys like tin 
soldiers or his bow and arrow. He’s almost a man, and 
quite a fine one at that. There’s this story about my boy 
that I still do like to tell. It’s a tale that really shows how 
he turned everything all around. My son Roy was about 
ten years old at the time. His friend Tom came over to 
play. They decided to steal my neighbor’s dune buggy 
and to go for a joy ride in it. They went back behind 
this cotton field onto a horse trail. Well, that dune buggy 
filled up, with first one kid and then another. Tom was 
at the wheel driving this contraption. They didn’t drive 
too far when they hit a big brass nail, from the railroad 
or something. The tire went flat, they hit a holding pen 
and then they all fell head first into ashes and dried mud 
when the buggy crashed. Those kids didn’t have good 
sense at all about what they were doing. My neighbor 
Ken had to sell the dune buggy at a loss after that. I 
know I sent Roy to bed without dinner, and that was 
the least of his punishments. I told my son he had a 
duty to pay back the damages. He toiled all summer in 
Ken’s tack room so he could do right by him. My boy 
gave up swimming at the community pool that whole 
hot summer. He used the cash to pay back Ken. Did 
Roy pull something like that again? No he did not. It’s 
been nine years since that happened. He learned to 
choose his friends and his actions more carefully. I 
never used a lash or hit him, but he learned to toe the 
line and not to steal or lie ever again. And the law left 
him alone because he paid Ken back. He’s not a bad 
boy, my Roy, and he didn’t fail to make good choices 
other times. Just once he thought he’d like to take out a 
hot rod one day and got lucky that his mistake didn’t 
ruin his life. Roy has now become a good role model 
for Beau, my middle son, and for Luke, the youngest, 

(a) Is the speaker rhotic or non-rhotic? Cite examples 
from the recording. 

(b) This speaker has variable ‘smoothing’ 
(monophthongization) of the /aɪ/ diphthong (in words 
of the lexical set PRICE), resulting in an [a:] 
pronunciation. Give examples of the variability. 

(c) The speaker also has variable ‘smoothing’ of the 
/ɔɪ/ diphthong (in words of the lexical set CHOICE), 
resulting in an [ɔ:] pronunciation. Give examples. 

(d) Can you hear any examples of Southern Breaking 
(diphthongization) in the vowels /æ/, /ɛ/ and /ɪ/? 

(e) Many varieties of Texan English are said to exhibit 
(variably) neutralization of the /ɪ/ vs /ɛ/ distinction 
before coda /n/. Is this true for this speaker? 

5 Australian English 



Listen to Track 13.8. This is a PAC recording. The 
speaker is reading the same word list as the SSE speaker 
in exercise 3. 

(a) Australian English is said to be non-rhotic. Is this 
true for this speaker? 

(b) Australian English is said to have ‘dark l’ in all 
syllabic positions. Is this true for this speaker? 

(c) Australian English is said to have Flapping for 
intervocalic /t/. Is this present on the recording? Is it 
variable? 

(d) In our discussion of Australian English, we said that 
the vowel phoneme system was the same as that of 
RP, but that there were vowel shift phenomena. Are 
any of these present on the recording? 

6 Indian English (Mumbai/Bombay) 

Listen to Track 13.9. This is a PAC recording. See word 
list 2 on the PAC website. 

(a) The speaker is variably non-rhotic. Give examples 
which demonstrate both rhotic and non-rhotic 
pronunciations. 

(b) Cite examples of TH-stopping on the recording. 

(c) Is there aspiration of the word-initial /p/ phonemes 
in words 1-6? 

(d) Does the speaker have ‘clear’ (palatalized) 
realizations of the /l/ phoneme? 

(e) How does the speaker realize the /r/ phoneme? 

7 Comparison of the accents in exercises 1-6 with RP and 
GA 

Listen to Tracks 13.10 and 13.11. These are PAC 
recordings of an RP and a GA speaker. They are reading 
the following passage from the PAC project: 

Christmas interview of a television evangelist 

If television evangelists are anything like the rest of us, 
all they really want to do in Christmas week is snap at 
their families, criticize their friends and make their 
neighbours’ children cry by glaring at them over the 
garden fence. Yet society expects them to be as jovial 
and beaming as they are for the other fifty-one weeks 
of the year. If anything, more so. 

Take the Reverend Peter ‘Pete’ Smith, the ‘TV vicar’ 
who sends out press releases in which he describes 
himself as ‘the man who has captured the spirit of the 
age’. Before our 9 a.m. meeting at his ‘media office’ on 
Crawshaw Avenue, South London, he faced, he says, a 
real dilemma. Should he make an effort ‘to behave like 
a Christian’ - throw his door open, offer me a cup of 
tea - or should he just play it cool, study his fingernails 
in a manner that showed bored indifference and get rid 
of me as quickly as possible? In the end, he did neither. 

‘As a matter of fact, John’, he says in a loud Estuary 
English twang, ‘St Francis said, “At all times preach the 
gospel and speak whenever you have to.” But hey, he 
didn’t mean “Be on your best behaviour and be happy 
all the time.” I could have been extra-polite to you, but 
the real me would have come out as I was talking. You 
cannot disguise what you are.’ 

‘And what are you then, Pete?’ 

‘Well, I’m a Christian, John. I’ve been one since I 
was 14. And I know for sure that Christianity will be 
judged more on who you are rather than what you have 
to say about it. Many church leaders don’t appear to 
understand this. They think we can only be really 
Christian when we are ramming the doctrine of the 
Creation down people’s throats. But if you try to force- 
feed people they get sick of it and think you’re a pain. 
It’s seen as the job of a Christian leader to wear a dog- 
collar and dress in purple and always be talking about 
the real meaning of the New Testament. In reality, that 
turns people right off!’ 

In many ways, ‘Pete’ Smith looks exactly how you’d 
expect a high-profile, born-again Christian to look: tall, 
handsome, clean-cut and evenly sun-tanned. He has 
those scarily white teeth that TV evangelists tend to 
have, and he doesn’t wear a dogcollar. In fact, when 
doing his various religious programmes on Sunday 
mornings, he has been known to wear a black leather 
jacket instead, in casual mode. Today, the look is more 
business-like: metal-rimmed glasses, a grey suit, a blue 
open-neck shirt, and fashionable black shoes with large 
buckles. Smith is 44 but he looks a mere 24. 

During the whole interview, there wasn’t any talk of 
the poor or the needy but only of his forthcoming trip to 
China in February and the masses waiting for his 
message there. I ventured a few questions relating to 
the charity trust he founded some ten years ago and 
which, it is generally agreed, employs eight hundred 
staff and runs schools, hospitals and hostels around the 
world. And what about the gambling organization he 
has been willing to advise? Is that a temporary activity 
or might it be true that he has accepted to be paid to sit 
on its Board of Directors? Which side is religion on 
these days? Does money matter? It was as if I had 
launched a few missiles in his direction. He just sighed 
in answer: ‘I’m only human, John. God knows I do my 
best and often fail. But it’s no skin off my nose if our 
enemies sneer at some of the good work we do. Truth 
will out.’ 

(a) Listen to the realizations of the /r/ phoneme and the 
/l/ phoneme in all of the accents in exercises 1-7. What 
differences do you find? 

(b) Listen to the vowel in words of the lexical set LOT 
(e.g. not, hot, cot). What differences do you find? 
Many American speakers lack the rounded vowel [ɒ] 
in words of the lexical set LOT, in words such as 
possible and dog. Instead, they are said to have an 
unrounded [ɑ] vowel, so that words such as pocket 
sound like packet to British hearers. Is this true for the 
GA speaker? 

(c) Which accents exhibit ‘happY Tensing’? Recall that 
this is the short, tense [i] pronunciation in the final, 
unstressed, syllable of words such as happy, lucky, 
slowly. 

(d) Listen to words which have word-final schwa in 
RP (e.g. leader, Cinderella). What vowels do you find 
in such words among the various speakers? 

(e) Among the non-rhotic accents, RP is said to have 
the centring diphthongs /ɪə/, /ɛə/ and /ʊə/, in words of 
the lexical sets NEAR, SQUARE and CURE. What 
realizations of these phonemes do you find among the 
non-rhotic speakers? 

For a more detailed account of articulatory phonetics, see D. 
Abercrombie (1967) Elements of General Phonetics, 
Edinburgh: Edinburgh University Press, and J. C. Catford 
(1988) A Practical Course in Phonetics, Oxford: Clarendon 
Press. An introduction to phonetics which concentrates mostly 
on English is J. D. O’Connor (1973) Phonetics, London: 
Penguin. For an introduction to phonetics geared towards 
English, and with a good introductory coverage of acoustic 
phonetics, see P. Ladefoged (2010) A Course in Phonetics 
(sixth edition, with Keith Johnson), New York: Harcourt Brace 
Jovanovich. For standard descriptions of the RP accent, see A. 
C. Gimson (1993) An Introduction to the Pronunciation of 
English, London: Arnold (third edition, ed. by A. Cruttenden). 

For an introduction to English phonetics and phonology 
which covers, in much greater detail, some of what we have 
covered here, and more, see H. Giegerich (1992) English 
Phonology: An Introduction, Cambridge: Cambridge 
University Press. 

For an introduction to phonological theory, see any of the 
following: P. Carr and J.-P. Montreuil (2012) Phonology 
(second edition), London: Palgrave Macmillan; J. Durand 
(1990) Generative and Non-Linear Phonology, London: 
Longman; F. Katamba (1988) An Introduction to Phonology, 
London: Longman; C. Gussenhoven and H. Jacobs (2005) 
Understanding Phonology (second edition), London: Arnold; 
R. Lass (1984) Phonology, Cambridge: Cambridge University 
Press; A. Spencer (1996) Phonology, Oxford: Blackwell; I. 
Roca and W. Johnson (1999) A Course in Phonology, Oxford: 
Blackwell. 

Students may proceed from one of these textbooks to more 
advanced treatments of phonological theory, such as J. 
Goldsmith (1989) Autosegmental and Metrical Phonology, 
Oxford: Blackwell; M. Kenstowicz (1994) Phonology in 
Generative Grammar, Oxford: Blackwell; and R. Kager (1999) 
Optimality Theory, Cambridge: Cambridge University Press. 
For an approach to English phonetics and phonology from the 
viewpoint of the theory known as government phonology, with 
good coverage of varieties of British and American English, see 
J. Harris (1994) English Sound Structure, Oxford: Blackwell. 

The account of English word stress given here owes a great 
deal to the work of Lionel Guierre and his followers, especially 
A. Deschamps, J.-L. Duchet, M. Fournier and M. O’Neill 
(2004) English Phonology and Graphophonemics, Paris: 
Ophrys. For detailed coverage of English word stress, see E. 
Fudge (1984) English Word Stress, London: Allen and Unwin. 

For a general introduction to the study of intonation, see A. 
Cruttenden (1986) Intonation, Cambridge: Cambridge 
University Press. See too D. Crystal (1969) Prosodic Systems 
and Intonation in English, Cambridge: Cambridge University 
Press. An excellent recent textbook coverage of English 
intonation is J. C. Wells (2006) English Intonation: An 
Introduction, Cambridge: Cambridge University Press. The 
accompanying CD is most useful. 

A useful book on British accents and dialects is A. Hughes 
and P. Trudgill (1987) English Accents and Dialects (second 
edition), London: Arnold. For an extensive description of the 
phonetics and phonology of a very wide range of English 
accents worldwide, see J. C. Wells’s (1982) three-volume work 
Accents of English, Cambridge: Cambridge University Press. I 
have adopted Wells’s three-way distinction (derived from 
Trubetzkoy) between systemic, realizational and lexical 
distributional differences between accents. For further textbook 
discussion of accent variation based on this tripartite 
distinction, see the relevant parts of Giegerich (1992; see 
above) and, of course, Wells (1982). I have followed Giegerich 
(1992) in comparing and contrasting SSE with RP and GA (for 
the simple reason that SSE is the medium in which I teach 
English phonetics and phonology). The reader should consult 
Giegerich (1992) for similar sorts of discussion. For an 
excellent introduction to phonetics and phonology in general, 
and varieties of English, see B. Collins and I. M. Mees (2008) 
Practical Phonetics and Phonology (second edition), London: 
Routledge. As with the Wells book on intonation, the 
accompanying CD is very useful. For further reading on 
specific varieties of English, see J. Durand (2004) ‘English in 
early 21st century Scotland: a phonological perspective’, La 
Tribune Internationale des Langues Vivantes 36:87-105; A. 
Przewozny (2004) ‘Variation in Australian English’, La Tribune 
Internationale des Langues Vivantes 36:74-86; and D. Watt 
and L. Milroy (1999) ‘Patterns of variation and change in three 
Newcastle vowels: is this dialect leveling?’ in G. Docherty and 
P. Foulkes (eds.), Urban Voices: Accent Studies in the British 
Isles, London: Arnold, pp. 25-46. 

It is as well to point out to any reader who wishes to pursue 
the further reading suggested here that the choice of symbols 
used to represent the vowel phonemes of various accents of 
English will almost certainly vary from one author to another. 
This is inevitable, since there is a necessary degree of 
arbitrariness built into such choices. However, the reader 
should not find it too demanding to work out the 
correspondences between the symbols. The main point, for 
non-native speakers, is that much of the literature, and also the 
pronouncing dictionaries, use the symbol /e/ for words such as 
‘dress’ in RP and GA, whereas we have used the phonetically 
more accurate epsilon symbol ‘ɛ’, in both phonemic and 
phonetic representations. We have used that same symbol for 
the RP centring diphthong in words of the sort ‘square’: /ɛə/. 
This helps us show that the current monophthongal realization 
in RP is the long, low-mid vowel [ɛ:], as in [skwɛ:] (‘square’). 